Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752731AbdLGTvp (ORCPT ); Thu, 7 Dec 2017 14:51:45 -0500 Received: from mx2.suse.de ([195.135.220.15]:42799 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752525AbdLGTvo (ORCPT ); Thu, 7 Dec 2017 14:51:44 -0500 Date: Thu, 7 Dec 2017 20:51:40 +0100 From: "Luis R. Rodriguez" To: Djalal Harouni Cc: "Luis R. Rodriguez" , Jessica Yu , Jessica Yu , Rusty Russell , Kees Cook , mbenes@suse.cz, atomlin@redhat.com, Petr Mladek , hare@suse.com, James Morris , "Eric W. Biederman" , "David S . Miller" , Andrew Morton , Linus Torvalds , linux-kernel Subject: Re: module: add debugging alias parsing support Message-ID: <20171207195140.GV729@wotan.suse.de> References: <20171130023605.29568-1-mcgrof@kernel.org> <20171130023605.29568-4-mcgrof@kernel.org> <20171130131710.ccccf4alzrnvmlp3@redbean> <20171130183947.GI729@wotan.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.6.0 (2016-04-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3286 Lines: 60 On Mon, Dec 04, 2017 at 10:01:02AM +0100, Djalal Harouni wrote: > Luis in your commit log you say: > > "Obviously userspace can be buggy though, and it can lie to us. We > currently have no easy way to determine this." > > Could you please share some info here ? how userspace can be buggy ? Sure, so kmod <= v19 was broken -- it returns 0 to modprobe calls, assuming the kernel module is built-in, where really we have a race as the module starts forming. kmod <= v19 has incorrect userspace heuristics, a userspace kmod fix is available for it: http://git.kernel.org/cgit/utils/kernel/kmod/kmod.git/commit/libkmod/libkmod-module.c?id=fd44a98ae2eb5eb32161088954ab21e58e19dfc4 Although this is fixed now, in theory userspace can regress again, but debugging these sorts of issue is not so easy at all. Technically, with aliases parsed (as this patch series proposed it's just debugging data), it should be possible for request_module() to also issue a finished_kmod_load() so that when request_module() completes and returns 0 if we know for certain the module is loaded. This could be useful either debugging purposes, for instance to catch when userspace lies once again, or if we do find value in it, to make a pedantic and patient new request_module_load(), which for instance would want to just wait for the module to really be loaded. What I mean is that some request_module() calls uses *assume* that if it returns 0 that the module is loaded, and this is incorrect. Each request_module() users today *should* in theory vet and ensure each module is loaded correctly. Since there is no generic form to do this today, try_then_request_module() was added to help with that. try_then_request_module() solves two issues really, one is it prevents a request to userspace if the module is already loaded, and two, it double checks again if the module is loaded at the end using whatever heuristic the caller wishes. AFAICT its subjective whether or not each caller should open code its own try_then_request_module() form, or if we should have a generic validator, as such even though I have some dev code to use aliases to do something like finished_kmod_load(), its just debug private code I use right now, and AFAICT we have no formal justification for a change. Perhaps one could argue that having a generic validator is much easier than expecting all callers to use it correctly, but last time I checked I found only one buggy users of request_module() and I fixed it (see 41124db869b ("fs: warn in case userspace lied about modprobe return") so the point was moot. You might think that expanding uses of aliases might benefit from a validator, but the only thing I can think of right now is that avoiding to call userspace if the module is already loaded, but this can *already* be done by using try_then_request_module() or open coding your own checker as get_fs_type() does, its just none of this is in a very generic form. If you do come up with a valid argument for a generic form or for a full generic wait as you expand use of aliases using request_module() do let me know and we can revisit separately. In the meantime I just noticed I do have a valid small optimization in my dev code which does make sense upstream which I'll send in for review. Luis