From: Rick Edgecombe
To: Andy Lutomirski, Ingo Molnar
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, hpa@zytor.com,
 Thomas Gleixner, Borislav Petkov, Nadav Amit, Dave Hansen,
 Peter Zijlstra, linux_dti@icloud.com, linux-integrity@vger.kernel.org,
 linux-security-module@vger.kernel.org, akpm@linux-foundation.org,
 kernel-hardening@lists.openwall.com, linux-mm@kvack.org,
 will.deacon@arm.com, ard.biesheuvel@linaro.org, kristen@linux.intel.com,
 deneen.t.dock@intel.com, Rick Edgecombe, Jessica Yu, Steven Rostedt
Subject: [PATCH v2 16/20] modules: Use vmalloc special flag
Date: Mon, 28 Jan 2019 16:34:18 -0800
Message-Id:
<20190129003422.9328-17-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190129003422.9328-1-rick.p.edgecombe@intel.com>
References: <20190129003422.9328-1-rick.p.edgecombe@intel.com>

Use the new flag for handling the freeing of special-permissioned memory
in vmalloc, and remove the places where memory was set RW before freeing,
which is no longer needed.

Since vfreeing VM_HAS_SPECIAL_PERMS memory in an interrupt is not
supported by vmalloc, the freeing of init sections is moved to a work
queue. Instead of call_rcu(), it now uses synchronize_rcu() in the work
queue.

Lastly, there is now a WARN_ON in module_memfree(), since it should not
be called in an interrupt with special memory, as VM_HAS_SPECIAL_PERMS
requires.

Cc: Jessica Yu
Cc: Steven Rostedt
Signed-off-by: Rick Edgecombe
---
 kernel/module.c | 77 +++++++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 38 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index ae1b77da6a20..1af5c8e19086 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -98,6 +98,10 @@ DEFINE_MUTEX(module_mutex);
 EXPORT_SYMBOL_GPL(module_mutex);
 static LIST_HEAD(modules);
 
+/* Work queue for freeing init sections in success case */
+static struct work_struct init_free_wq;
+static struct llist_head init_free_list;
+
 #ifdef CONFIG_MODULES_TREE_LOOKUP
 
 /*
@@ -1949,6 +1953,8 @@ void module_enable_ro(const struct module *mod, bool after_init)
 	if (!rodata_enabled)
 		return;
 
+	set_vm_special(mod->core_layout.base);
+	set_vm_special(mod->init_layout.base);
 	frob_text(&mod->core_layout, set_memory_ro);
 	frob_text(&mod->core_layout, set_memory_x);
 
@@ -1972,15 +1978,6 @@ static void module_enable_nx(const struct module *mod)
 	frob_writable_data(&mod->init_layout, set_memory_nx);
 }
 
-static void module_disable_nx(const struct module *mod)
-{
-	frob_rodata(&mod->core_layout, set_memory_x);
-	frob_ro_after_init(&mod->core_layout, set_memory_x);
-	frob_writable_data(&mod->core_layout, set_memory_x);
-	frob_rodata(&mod->init_layout, set_memory_x);
-	frob_writable_data(&mod->init_layout, set_memory_x);
-}
-
 /* Iterate through all modules and set each module's text as RW */
 void set_all_modules_text_rw(void)
 {
@@ -2024,23 +2021,8 @@ void set_all_modules_text_ro(void)
 	}
 	mutex_unlock(&module_mutex);
 }
-
-static void disable_ro_nx(const struct module_layout *layout)
-{
-	if (rodata_enabled) {
-		frob_text(layout, set_memory_rw);
-		frob_rodata(layout, set_memory_rw);
-		frob_ro_after_init(layout, set_memory_rw);
-	}
-	frob_rodata(layout, set_memory_x);
-	frob_ro_after_init(layout, set_memory_x);
-	frob_writable_data(layout, set_memory_x);
-}
-
 #else
-static void disable_ro_nx(const struct module_layout *layout) { }
 static void module_enable_nx(const struct module *mod) { }
-static void module_disable_nx(const struct module *mod) { }
 #endif
 
 #ifdef CONFIG_LIVEPATCH
@@ -2120,6 +2102,11 @@ static void free_module_elf(struct module *mod)
 
 void __weak module_memfree(void *module_region)
 {
+	/*
+	 * This memory may be RO, and freeing RO memory in an interrupt is not
+	 * supported by vmalloc.
+	 */
+	WARN_ON(in_interrupt());
 	vfree(module_region);
 }
 
@@ -2171,7 +2158,6 @@ static void free_module(struct module *mod)
 	mutex_unlock(&module_mutex);
 
 	/* This may be empty, but that's OK */
-	disable_ro_nx(&mod->init_layout);
 	module_arch_freeing_init(mod);
 	module_memfree(mod->init_layout.base);
 	kfree(mod->args);
@@ -2181,7 +2167,6 @@ static void free_module(struct module *mod)
 	lockdep_free_key_range(mod->core_layout.base, mod->core_layout.size);
 
 	/* Finally, free the core (containing the module structure) */
-	disable_ro_nx(&mod->core_layout);
 	module_memfree(mod->core_layout.base);
 }
 
@@ -3424,17 +3409,34 @@ static void do_mod_ctors(struct module *mod)
 
 /* For freeing module_init on success, in case kallsyms traversing */
 struct mod_initfree {
-	struct rcu_head rcu;
+	struct llist_node node;
 	void *module_init;
 };
 
-static void do_free_init(struct rcu_head *head)
+static void do_free_init(struct work_struct *w)
 {
-	struct mod_initfree *m = container_of(head, struct mod_initfree, rcu);
-	module_memfree(m->module_init);
-	kfree(m);
+	struct llist_node *pos, *n, *list;
+	struct mod_initfree *initfree;
+
+	list = llist_del_all(&init_free_list);
+
+	synchronize_rcu();
+
+	llist_for_each_safe(pos, n, list) {
+		initfree = container_of(pos, struct mod_initfree, node);
+		module_memfree(initfree->module_init);
+		kfree(initfree);
+	}
 }
 
+static int __init modules_wq_init(void)
+{
+	INIT_WORK(&init_free_wq, do_free_init);
+	init_llist_head(&init_free_list);
+	return 0;
+}
+module_init(modules_wq_init);
+
 /*
  * This is where the real work happens.
  *
@@ -3511,7 +3513,6 @@ static noinline int do_init_module(struct module *mod)
 #endif
 	module_enable_ro(mod, true);
 	mod_tree_remove_init(mod);
-	disable_ro_nx(&mod->init_layout);
 	module_arch_freeing_init(mod);
 	mod->init_layout.base = NULL;
 	mod->init_layout.size = 0;
@@ -3522,14 +3523,18 @@ static noinline int do_init_module(struct module *mod)
 	 * We want to free module_init, but be aware that kallsyms may be
 	 * walking this with preempt disabled. In all the failure paths, we
 	 * call synchronize_rcu(), but we don't want to slow down the success
-	 * path, so use actual RCU here.
+	 * path. We can't do module_memfree in an interrupt, so we do the work
+	 * and call synchronize_rcu() in a work queue.
+	 *
 	 * Note that module_alloc() on most architectures creates W+X page
 	 * mappings which won't be cleaned up until do_free_init() runs. Any
 	 * code such as mark_rodata_ro() which depends on those mappings to
 	 * be cleaned up needs to sync with the queued work - ie
 	 * rcu_barrier()
 	 */
-	call_rcu(&freeinit->rcu, do_free_init);
+	if (llist_add(&freeinit->node, &init_free_list))
+		schedule_work(&init_free_wq);
+
 	mutex_unlock(&module_mutex);
 	wake_up_all(&module_wq);
 
@@ -3826,10 +3831,6 @@ static int load_module(struct load_info *info, const char __user *uargs,
 	module_bug_cleanup(mod);
 	mutex_unlock(&module_mutex);
 
-	/* we can't deallocate the module until we clear memory protection */
-	module_disable_ro(mod);
-	module_disable_nx(mod);
-
  ddebug_cleanup:
 	ftrace_release_mod(mod);
 	dynamic_debug_remove(mod, info->debug);
-- 
2.17.1