Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932758AbcLHTtv (ORCPT ); Thu, 8 Dec 2016 14:49:51 -0500 Received: from mail.kernel.org ([198.145.29.136]:46208 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932766AbcLHTts (ORCPT ); Thu, 8 Dec 2016 14:49:48 -0500 From: "Luis R. Rodriguez" To: shuah@kernel.org, jeyu@redhat.com, rusty@rustcorp.com.au, ebiederm@xmission.com, dmitry.torokhov@gmail.com, acme@redhat.com, corbet@lwn.net Cc: martin.wilck@suse.com, mmarek@suse.com, pmladek@suse.com, hare@suse.com, rwright@hpe.com, jeffm@suse.com, DSterba@suse.com, fdmanana@suse.com, neilb@suse.com, linux@roeck-us.net, rgoldwyn@suse.com, subashab@codeaurora.org, xypron.glpk@gmx.de, keescook@chromium.org, atomlin@redhat.com, mbenes@suse.cz, paulmck@linux.vnet.ibm.com, dan.j.williams@intel.com, jpoimboe@redhat.com, davem@davemloft.net, mingo@redhat.com, akpm@linux-foundation.org, torvalds@linux-foundation.org, linux-kselftest@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, "Luis R. Rodriguez" Subject: [RFC 10/10] kmod: add a sanity check on module loading Date: Thu, 8 Dec 2016 11:49:30 -0800 Message-Id: <20161208194930.2816-1-mcgrof@kernel.org> X-Mailer: git-send-email 2.10.2 In-Reply-To: <20161208184801.1689-1-mcgrof@kernel.org> References: <20161208184801.1689-1-mcgrof@kernel.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8864 Lines: 253 kmod has an optimization in place whereby if a some kernel code uses request_module() on a module already loaded we never bother userspace as the module already is loaded. This is not true for get_fs_type() though as it uses aliases. Additionally kmod <= v19 was broken -- it returns 0 to modprobe calls, assuming the kernel module is built-in, where really we have a race as the module starts forming. kmod <= v19 has incorrect userspace heuristics, a userspace kmod fix is available for it: http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4 This changes kmod to address both: o Provides the alias optimization for get_fs_type() so modules already loaded do not get re-requested. o Provides a sanity test to verify modprobe's work This is important given how any get_fs_type() users assert success means we're ready to go, and tests with the new test_kmod stress driver reveal that request_module() and get_fs_type() might fail for a few other reasons. You don't need old kmod to fail on request_module() or get_fs_type(), with the right system setup, these calls *can* fail today. Although this does get us in the business of keeping alias maps in kernel, the the work to support and maintain this is trivial. Aditionally, since it may be important get_fs_type() should not fail on certain systems, this tightens things up a bit more. The TL;DR: kmod <= v19 will return 0 on modprobe calls if you are built-in, however its heuristics for checking if you are built-in were broken. It assumed that having the directory /sys/module/module-name but not having the file /sys/module/module-name/initstate is sufficient to assume a module is built-in. The kernel loads the inittstate attribute *after* it creates the directory. This is an issue when modprobe returns 0 for kernel calls which assumes a return of 0 on request_module() can give you the right to assert the module is loaded and live. We cannot trust returns of modprobe as 0 in the kernel, we need to verify that modules are live if modprobe return 0 but only if modules *are* modules. The kernel heuristic we use to determine if a module is built-in is that if modprobe returns 0 we know we must be built-in or a module, but if we are a module clearly we must have a lingering kmod dangling on our linked list. If there is no modules there we are *somewhat* certain the module must be built in. This is not enough though... we cannot easily work around this since the kernel can use aliases to userspace for modules calls. For instance fs/namespace.c uses fs-modulename for filesystesms on get_fs_type(), so these need to be taken into consideration as well. Using kmod <= 19 will give you a NULL get_fs_type() return even though the module was loaded... That is a corner case, there are other failures for request_module() though -- the other failures are not easy to reproduce though but fortunately we have a stress test driver to help with that now. Use the following tests: # tools/testing/selftests/kmod/kmod.sh -t 0008 # tools/testing/selftests/kmod/kmod.sh -t 0009 You can more easily see this error if you have kmod <= v19 installed. You will need to install kmod <= v19, be sure to install its modprobe into /sbin/ as by default the 'make install' target does not replace your own. This test helps cure test_kmod cases 0008 0009 so enable them. Reported-by: Martin Wilck Reported-by: Randy Wright Signed-off-by: Luis R. Rodriguez --- kernel/kmod.c | 73 ++++++++++++++++++++++++++++++++++++ kernel/module.c | 11 ++++-- tools/testing/selftests/kmod/kmod.sh | 9 ++--- 3 files changed, 85 insertions(+), 8 deletions(-) diff --git a/kernel/kmod.c b/kernel/kmod.c index a0f449f77ed7..6bf0feab41d1 100644 --- a/kernel/kmod.c +++ b/kernel/kmod.c @@ -61,6 +61,11 @@ static DECLARE_RWSEM(umhelper_sem); #ifdef CONFIG_MODULES +bool finished_loading(const char *name); +int module_wait_until_finished(const char *name); +struct module *find_module_all(const char *name, size_t len, + bool even_unformed); + /* modprobe_path is set via /proc/sys. */ @@ -158,6 +163,72 @@ int get_kmod_umh_count(void) return atomic_read(&kmod_concurrent); } +static bool kmod_exists(char *name) +{ + struct module *mod; + + mutex_lock(&module_mutex); + mod = find_module_all(name, strlen(name), true); + mutex_unlock(&module_mutex); + + if (mod) + return true; + + return false; +} + +/* + * The assumption is this must be a module, it could still not be live though + * since kmod <= 19 returns 0 even if it was not ready yet. Allow for force + * wait check in case you are stuck on old userspace. + */ +static int wait_for_kmod(char *name) +{ + int ret = 0; + + if (!finished_loading(name)) + ret = module_wait_until_finished(name); + + return ret; +} + +/* + * kmod <= 19 will tell us modprobe returned 0 even if the module + * is not ready yet, it does this because it checks the /sys/module/mod-name + * directory and if its created but the /sys/module/mod-name/initstate is not + * created it assumes you have a built-in driver. At this point the module + * is still unformed, and telling the kernel at any point via request_module() + * will cause issues given a lot of places in the kernel assert that the driver + * will be present and ready. We need to account for this. + * + * If we had a module and even if buggy modprobe returned 0, we know we'd at + * least have a dangling kmod entry we could fetch. + * + * If modprobe returned 0 and we cannot find a kmod entry this is a good + * indicator your by userspace and kernel space that what you have is built-in. + * + * If modprobe returned 0 and we can find a kmod entry we should air on the + * side of caution and wait for the module to become ready or going. + * + * In the worst case, for built-in, we have to check on the module list for + * as many aliases possible the kernel gives the module, if that is n, that + * n traversals on the module list. + */ +static int finished_kmod_load(char *name) +{ + int ret = 0; + bool is_fs = (strlen(name) > 3) && (strncmp(name, "fs-", 3) == 0); + + if (kmod_exists(name)) { + ret = wait_for_kmod(name); + } else { + if (is_fs && kmod_exists(name + 3)) + ret = wait_for_kmod(name + 3); + } + + return ret; +} + /** * __request_module - try to load a kernel module * @wait: wait (or not) for the operation to complete @@ -211,6 +282,8 @@ int __request_module(bool wait, const char *fmt, ...) trace_module_request(module_name, wait, _RET_IP_); ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC); + if (!ret) + ret = finished_kmod_load(module_name); kmod_umh_threads_put(); return ret; diff --git a/kernel/module.c b/kernel/module.c index e420ed67e533..bf854321dca0 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -590,8 +590,8 @@ EXPORT_SYMBOL_GPL(find_symbol); * Search for module by name: must hold module_mutex (or preempt disabled * for read-only access). */ -static struct module *find_module_all(const char *name, size_t len, - bool even_unformed) +struct module *find_module_all(const char *name, size_t len, + bool even_unformed) { struct module *mod; @@ -3325,7 +3325,7 @@ static int post_relocation(struct module *mod, const struct load_info *info) } /* Is this module of this name done loading? No locks held. */ -static bool finished_loading(const char *name) +bool finished_loading(const char *name) { struct module *mod; bool ret; @@ -3486,6 +3486,11 @@ static int may_init_module(void) return 0; } +int module_wait_until_finished(const char *name) +{ + return wait_event_interruptible(module_wq, finished_loading(name)); +} + /* * We try to place it in the list now to make sure it's unique before * we dedicate too many resources. In particular, temporary percpu diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh index 9ea1864d8bae..ccf35b8d1671 100755 --- a/tools/testing/selftests/kmod/kmod.sh +++ b/tools/testing/selftests/kmod/kmod.sh @@ -382,7 +382,7 @@ kmod_test_0008() let EXTRA=$MODPROBE_LIMIT/2 config_num_thread_limit_extra $EXTRA config_trigger ${FUNCNAME[0]} - config_expect_result ${FUNCNAME[0]} -EINVAL + config_expect_result ${FUNCNAME[0]} SUCCESS } kmod_test_0009() @@ -392,7 +392,7 @@ kmod_test_0009() #let EXTRA=$MODPROBE_LIMIT/3 config_num_thread_limit_extra 5 config_trigger ${FUNCNAME[0]} - config_expect_result ${FUNCNAME[0]} -EINVAL + config_expect_result ${FUNCNAME[0]} SUCCESS } trap "test_finish" EXIT @@ -442,8 +442,7 @@ kmod_test_0004 kmod_test_0005 kmod_test_0006 kmod_test_0007 - -#kmod_test_0008 -#kmod_test_0009 +kmod_test_0008 +kmod_test_0009 exit 0 -- 2.10.1