Date: Mon, 1 Oct 2018 10:25:21 +1000
From: Dave Chinner
To: Alan Cox
Cc: TongZhang, darrick.wong@oracle.com, linux-xfs@vger.kernel.org, LKML, linux-security-module@vger.kernel.org, Wenbo Shen
Subject: Re: Leaking Path in XFS's ioctl interface(missing LSM check)
Message-ID: <20181001002521.GM31060@dastard>
References: <5EF0D46A-C098-4B51-AD13-225FFCA35D4C@vt.edu> <20180926013329.GD31060@dastard> <20180926192426.472360ea@alans-desktop> <20180927013812.GF31060@dastard> <20180930151652.6975610c@alans-desktop>
In-Reply-To: <20180930151652.6975610c@alans-desktop>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun,
Sep 30, 2018 at 03:16:52PM +0100, Alan Cox wrote:
> > > CAP_SYS_ADMIN is also a bit weird because low level access usually
> > > implies you can bypass access controls so you should also check
> > > CAP_SYS_DAC ?
> >
> > Do you mean CAP_DAC_READ_SEARCH as per the newer handle syscalls?
> > But that only allows bypassing directory search operations, so maybe
> > you mean CAP_DAC_OVERRIDE?
>
> It depends what the ioctl allows you to do. If it allows me to bypass
> DAC and manipulate the file system to move objects around then it's a
> serious issue.

These interfaces have always been allowed to do that. You can't do
transparent online background defragmentation without bypassing DAC
and moving objects around. You can't scrub metadata and data without
bypassing DAC. You can't do dedupe without bypassing /some level/ of
DAC to get access to the filesystem used space map and the raw block
device to hash the data. But the really important access control for
dedupe - avoiding deduping data across files at different security
levels - isn't controlled at all.

> The underlying problem is if CAP_SYS_ADMIN is able to move objects around
> then I can move modules around.

Yup, anything with direct access to block devices can do that. Many
filesystem and storage utilities are given direct access to the
block device, because that's what they need to work.

e.g. in DM land, the control ioctls (ctl_ioctl()) are protected by:

	/* only root can play with this */
	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;

Think about it - if DM control ioctls only require CAP_SYS_ADMIN,
then if you have that cap you can use DM to remap any block in a block
device to any other block. You don't need the filesystem to move
stuff around; it can be moved around without the filesystem knowing
anything about it.

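For illustration only, here is a rough userspace sketch (not code from the
kernel tree) of how a storage utility could check up front whether it holds
the capability that a gate like the one above tests. It reads the CapEff
mask from /proc/self/status and tests bit 21, which is CAP_SYS_ADMIN. The
helper names are made up for this example. A set bit is necessary but not
sufficient: capable() also depends on which user namespace the caller is in
and on any LSM that is loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CAP_SYS_ADMIN_BIT 21	/* value of CAP_SYS_ADMIN in <linux/capability.h> */

/* Hypothetical helper: find the "CapEff:" line in /proc/<pid>/status text
 * and parse its hex mask. Returns false if the line is missing or malformed. */
static bool parse_cap_eff(const char *status_text, uint64_t *mask)
{
	const char *p = status_text;

	assert(status_text != NULL && mask != NULL);
	while (p != NULL && *p != '\0') {
		if (strncmp(p, "CapEff:", 7) == 0) {
			unsigned long long v;

			if (sscanf(p + 7, "%llx", &v) != 1)
				return false;
			*mask = v;
			return true;
		}
		p = strchr(p, '\n');	/* advance to the start of the next line */
		if (p != NULL)
			p++;
	}
	return false;
}

/* Hypothetical helper: test one capability bit in a parsed mask. */
static bool mask_has_cap(uint64_t mask, int cap)
{
	assert(cap >= 0 && cap < 64);
	return ((mask >> cap) & 1u) != 0;
}

/* Hypothetical helper: does this process have CAP_SYS_ADMIN in its
 * effective set? Returns false if /proc is unavailable. */
static bool self_has_cap_sys_admin(void)
{
	char buf[8192];
	uint64_t mask = 0;
	size_t n;
	FILE *f = fopen("/proc/self/status", "r");

	if (f == NULL)
		return false;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_cap_eff(buf, &mask) && mask_has_cap(mask, CAP_SYS_ADMIN_BIT);
}
```

A tool would call self_has_cap_sys_admin() before issuing the ioctl and
print a useful error instead of just failing with -EACCES.
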
> We already have a problem with
> CAP_DAC_OVERRIDE giving you CAP_SYS_RAWIO (ie totally owning the machine)
> unless the modules are signed, if xfs allows ADMIN as well then
> CAP_SYS_ADMIN is much easier to obtain and you'd get total system
> ownership from it.

Always been the case, and it's not isolated to XFS.

$ git grep CAP_SYS_ADMIN fs/ |wc -l
139
$ git grep CAP_SYS_ADMIN block/ |wc -l
16
$ git grep CAP_SYS_ADMIN drivers/block/ drivers/scsi |wc -l
88

The "CAP_SYS_ADMIN for ioctls" trust model in the storage stack
extends both above and below the filesystem. If you don't trust
CAP_SYS_ADMIN, then you are basically saying that you cannot trust
your storage management and maintenance utilities at any level.

> Not good.
>
> > Regardless, this horse bolted long before those syscalls were
> > introduced. The time to address this issue was when XFS was merged
> > into linux all those years ago, back when the apps that run in
> > highly secure restricted environments that use these interfaces were
> > being ported to linux. We can't change this now without breaking
> > userspace....
>
> That's what people said about setuid shell scripts.

Completely different. setuid shell scripts got abused as a hack for
the lazy to avoid setting up permissions properly and hence were
easily exploited.

The storage stack is completely dependent on a simplistic layered
trust model in which root (CAP_SYS_ADMIN) is god. The storage trust
model falls completely apart if we don't have a trusted root user to
administer all layers of the storage stack.

This isn't the first time I've raised this issue - I raised it back
when the user namespace stuff was railroaded into the kernel, and
was essentially ignored by the userns people. As a result, we end up
with all the storage management ioctls restricted to the initns
where we have trusted CAP_SYS_ADMIN users.

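To make the "restricted to the initns" point concrete, here is a toy
userspace model, not kernel code. Every toy_ name is invented for this
sketch, and it deliberately leaves out namespace ownership rules and LSM
hooks. It only shows the shape of the distinction: a capable()-style check
asks whether the caller holds the capability relative to the initial user
namespace, so root inside a child namespace passes a check against its own
namespace but fails the one the storage ioctls use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAP_SYS_ADMIN 21	/* same bit number as CAP_SYS_ADMIN */

/* Toy user namespace: only a parent link; the initial namespace has none. */
struct toy_userns {
	const struct toy_userns *parent;
};

/* Toy credentials: the namespace the task lives in plus an effective mask. */
struct toy_cred {
	const struct toy_userns *ns;
	uint64_t effective;
};

static const struct toy_userns toy_init_ns = { NULL };

/* Toy namespace-relative check: the capability counts only if the target
 * namespace is the caller's own namespace or one of its descendants. */
static bool toy_ns_capable(const struct toy_cred *c,
			   const struct toy_userns *target, int cap)
{
	assert(c != NULL && cap >= 0 && cap < 64);
	for (const struct toy_userns *ns = target; ns != NULL; ns = ns->parent) {
		if (ns == c->ns)
			return ((c->effective >> cap) & 1u) != 0;
	}
	return false;	/* caller's namespace is not an ancestor of the target */
}

/* Toy capable(): always checks against the initial namespace. */
static bool toy_capable(const struct toy_cred *c, int cap)
{
	return toy_ns_capable(c, &toy_init_ns, cap);
}
```

With a child namespace whose parent is toy_init_ns, a "container root"
credential with the bit set passes toy_ns_capable() against the child but
fails toy_capable(), while a host root credential passes both.
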
I've also raised it more recently in the unprivileged mount
discussions (so untrusted root in containers can mount filesystems)
- no solution to the underlying trust model deficiencies was found
in those discussions, either. Instead, filesystems that can be
mounted by untrusted users (i.e. FUSE) have a special flag in their
fstype definition to say this is allowed.

Systems restricted by LSMs to the point where CAP_SYS_ADMIN is not
trusted have exactly the same issues. i.e. there's nobody trusted by
the kernel to administer the storage stack, and nobody has defined a
workable security model that can prevent untrusted users from
violating the existing storage trust model....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com