Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965014AbcJTQST (ORCPT ); Thu, 20 Oct 2016 12:18:19 -0400 Received: from mail-lf0-f54.google.com ([209.85.215.54]:35073 "EHLO mail-lf0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932817AbcJTQSR (ORCPT ); Thu, 20 Oct 2016 12:18:17 -0400 From: Aaron Tomlin To: linux-kernel@vger.kernel.org Cc: rusty@rustcorp.com.au, rostedt@goodmis.org Subject: [RFC PATCH 0/2] Possible race between load_module() error handling and kprobe registration ? Date: Thu, 20 Oct 2016 17:18:11 +0100 Message-Id: <1476980293-19062-1-git-send-email-atomlin@redhat.com> X-Mailer: git-send-email 2.5.5 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3606 Lines: 88 I think there is a race (albeit a hard-to-hit one) between load_module() error handling and kprobe registration which could cause a kernel page to become read-only, panic due to protection fault. In short, the protection that gets applied [at the bug_cleanup label] can be overridden by another CPU when executing set_all_modules_text_ro(). Therefore creating the possibility for the kprobe registration code path to touch a [formed] module that is being deallocated. Consequently we could free a mapped page, that is not 'writable'. The same page, when later accessed, will result in a page fault which cannot be handled. Below is an attempt to illustrate the race. Please note we assume that: - kprobe uses ftrace - parse_args() or mod_sysfs_setup() would have to fail - CPU Y and X do not attempt to load the same module - CPU Y would have to sneak in *after* CPU X called the two 'unset' functions but before CPU X removes the module from the list of all modules CPU X ... load_module // Unknown/invalid module // parameter specified ... after_dashes = parse_args(...) if (IS_ERR(after_dashes)) err = PTR_ERR(after_dashes) goto coming_cleanup: ... bug_cleanup: module_disable_ro(mod) module_disable_nx(mod) ... // set_all_modules_text_ro() on CPU Y sneaks in here <-----. // and overrides the effects of the previous 'unset' | ... | list_del_rcu(&mod->list) | | | CPU Y | ... | sys_finit_module | load_module | do_init_module | do_one_initcall | // mod->init | kprobe_example_init | register_kprobe | arm_kprobe | // kprobe uses ftrace | arm_kprobe_ftrace | register_ftrace_function | ftrace_startup | ftrace_startup_enable | ftrace_run_update_code | ftrace_arch_code_modify_post_process | { | // | // Set all [formed] module's | // core and init pages as | // read-only under | // module_mutex ... | // | set_all_modules_text_ro() ---------' } The following patches (I hope) is an attempt to address this theoretical race. Please let me know your thoughts. Aaron Tomlin (2): module: Ensure a module's state is set accordingly during module coming cleanup code module: When modifying a module's text ignore modules which are going away too kernel/module.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) -- 2.5.5