Received: by 10.223.185.111 with SMTP id b44csp427173wrg; Fri, 9 Mar 2018 07:17:57 -0800 (PST) X-Google-Smtp-Source: AG47ELsIzj+sdKp0hRhr17FzS4S2nLdghFnfYLvPB0pIkXNcR6ZxRGOV07oArRJYD8MM837kxPMw X-Received: by 10.101.72.2 with SMTP id h2mr24747881pgs.240.1520608677105; Fri, 09 Mar 2018 07:17:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520608677; cv=none; d=google.com; s=arc-20160816; b=vC7oRzzwiR20hPUs7D2quaZf/EuBOpCuq9DhgXcNAGvKYQp+9nYGBYzux0tErOTYP1 Z7o7ZScOV0vmFZ/CxaR4UTUWL+VdPSJPH8PTef9a50fxbRRZPtokNHnjmxRUqCAwDANh 8qlgDuPqRPerxLUPYuf3ljKeLvYeISdNRDCzGPo2aShKpz2wnrhMpXgMmc2RMnp+gKeQ XVzNZIA3d5/hnzVtTs46fey5JIyw4QCeUxbvshWhArw0Hsj/e0m2cBmr7oetmruiSYiP pXVm4tEqiVWQSBJqCO/WuY1s1kMXK42VwPrkrWIb1Tm48/NiLJdLO7xE99YRFHZr+3hG o6hg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:references:in-reply-to:mime-version :dmarc-filter:arc-authentication-results; bh=VDrDKCUkwbKvT/i+sX2D2CJmSbZXSE/tXODyBgGNjYg=; b=i22atVNsMxjt5B7apUSD05UXJdHhFqXXxb6LNjkQME7EuWEMGcY46xLMPNjpg/GNAZ 7UW33gpeKCyROur1P+eZzUr5LtaMv6bt00cbD1i/mWxSu/lSO/DBRCOJEeTZTkqikoVu 50QB3dlj5+2uveD8GYAVrxWHgMz0/+S+0DIL8lF07m+r5CG8PgmvDUJKycBwYANehlGz UCiPbjjSWbxGi41toQXpqEt3+nW04N+VViNqvyUP7nD5EBZMLna6NJxBMu1Xl4zzuqbv tiJ+fxZZmVTDszLKqwQjYje7STo+BsDu+xU5ykRJyGehNZe2JCyZaIzZRdRbD1ufaBtC ihGw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n27si999339pfg.102.2018.03.09.07.17.42; Fri, 09 Mar 2018 07:17:57 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932178AbeCIPQl convert rfc822-to-8bit (ORCPT + 99 others); Fri, 9 Mar 2018 10:16:41 -0500 Received: from mail.kernel.org ([198.145.29.99]:46686 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751096AbeCIPQj (ORCPT ); Fri, 9 Mar 2018 10:16:39 -0500 Received: from mail-it0-f49.google.com (mail-it0-f49.google.com [209.85.214.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id A21AB2177C for ; Fri, 9 Mar 2018 15:16:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A21AB2177C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=luto@kernel.org Received: by mail-it0-f49.google.com with SMTP id k79-v6so3220212ita.2 for ; Fri, 09 Mar 2018 07:16:38 -0800 (PST) X-Gm-Message-State: AElRT7E/U4+Hv9oE0wMpb+nltT/AqzFykbdOusFbp4ysdGE0wAnYfJ5w aaOBTzFtWkVYiC1YDsBVfwI2BFNSFAuaal+vtLeo1g== X-Received: by 10.36.78.14 with SMTP id r14mr3899905ita.146.1520608597950; Fri, 09 Mar 2018 07:16:37 -0800 (PST) MIME-Version: 1.0 Received: by 10.2.137.101 with HTTP; Fri, 9 Mar 2018 07:16:17 -0800 (PST) In-Reply-To: <87478c51-59a7-f6ac-1fb2-f3ca2dcf658b@fb.com> References: <20180306013457.1955486-1-ast@kernel.org> <87478c51-59a7-f6ac-1fb2-f3ca2dcf658b@fb.com> From: Andy Lutomirski Date: Fri, 9 Mar 2018 15:16:17 +0000 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH net-next] modules: allow modprobe load regular elf binaries To: Alexei Starovoitov Cc: Linus Torvalds , Kees Cook , Alexei Starovoitov , Djalal Harouni , Al Viro , "David S. Miller" , Daniel Borkmann , Greg KH , "Luis R. Rodriguez" , Network Development , LKML , kernel-team , Linux API Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org >> On Mar 8, 2018, at 9:08 PM, Alexei Starovoitov wrote: >> >> On 3/8/18 7:54 PM, Andy Lutomirski wrote: >> >> >> >>> On Mar 8, 2018, at 7:06 PM, Linus Torvalds wrote: >>> >>> >>> Honestly, that "read twice" thing may be what scuttles this. >>> Initially, I thought it was a non-issue, because anybody who controls >>> the module subdirectory enough to rewrite files would be in a position >>> to just execute the file itself directly instead. >> >> On further consideration, I think there’s another showstopper. This patch is a potentially severe ABI break. Right now, loading a module *copies* it into memory and does not hold a reference to the underlying fs. With the patch applied, all kinds of use cases can break in gnarly ways. Initramfs is maybe okay, but initrd may be screwed. If you load an ET_EXEC module from initrd, then umount it, then clear the ramdisk, something will go horribly wrong. Exactly what goes wrong depends on whether userspace notices that umount() failed. Similarly, if you load one of these modules over a network and then lose your connection, you have a problem. > > there is not abi breakage and file cannot disappear from running task. > One cannot umount fs while file is still being used. Sure it is. Without your patch, init_module doesn’t keep using the file, so it’s common practice to load a module and then delete or unmount it. With your patch, the unmount case breaks. This is likely to break existing userspace, so, in Linux speak it’s an ABI break. > >> >> The “read twice” thing is also bad for another reason: containers. Suppose I have a setup where a container can load a signed module blob. With the read twice code, the container can race and run an entirely different blob outside the container. > > Not only "read twice", but "read many". > If .text sections of elf that are not yet in memory can be modified > by malicious user, later they will be brought in with different code. > I think the easiest fix to tighten this "umh modules" to CAP_SYS_ADMIN. Given this issue, I think the patch would need Kees’s explicit ack. I had initially thought your patch had minimal security impact, but I was wrong Module security is very complicated and needs to satisfy a bunch of requirements. There is a lot of code in the kernel that assumes that it’s sufficient to verify a module once at load time, your patch changes that, and this has all kinds of nasty interactions with autoloading. Kees is very reasonable, and he’ll change his mind and ack a patch that he’s nacked when presented with a valid technical argument. But I think my ABI break observation is also a major problem, and Linus is going to be pissed if this thing lands in his tree and breaks systems due to an issue that was raised during review. So I think you need to either rework the patch or do a serious survey of how all the distros deal with modules (dracut, initramfs-tools, all the older stuff, and probably more) and make sure they can all handle your patch. I'd also be concerned about anyone who puts /lib/modules on less reliable storage than they use for the rest of their system. (I know it's quite common to have /boot be the only non-RAID partition on a system, but modules don't generally live in /boot.) Also, I think you really ought to explain how your approach will work with MODULES=n or convince Linus that it’s okay to start adding core networking features that don’t work with MODULES=n and can't be built into the main kernel image.