Date: Tue, 5 May 2020 10:43:58 +1000
From: Dave Chinner
To: Jan Kara
Cc: "Darrick J. Wong", Francois, linux-ext4@vger.kernel.org
Subject: Re: ext4 and project quotas bugs
Message-ID: <20200505004358.GG2005@dread.disaster.area>
In-Reply-To: <20200430111436.GD12716@quack2.suse.cz>

On Thu, Apr 30, 2020 at 01:14:36PM +0200, Jan Kara wrote:
> On Wed 29-04-20 12:42:01, Dave Chinner wrote:
> > On Tue, Apr 28, 2020 at 06:48:24PM +0200, Jan Kara wrote:
> > > On Tue 28-04-20 08:53:51, Darrick J. Wong wrote:
> > > > On Tue, Apr 28, 2020 at 05:32:28PM +0200, Jan Kara wrote:
> > > > > > dd if=/dev/zero of=someoutput oflag=append
> > > > > > dd: failed to open 'someoutput': Invalid argument
> > > > >
> > > > > Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
> > > > > 'this id is not expressible in current user namespace' and some code gets
> > > > > confused along the way. We should refuse to set project -1 for a file...
> > > >
> > > > Awkward part: projid 4294967295 is allowed on XFS (at least by the
> > > > kernel), though the xfs quota tools do not permit that.
> > >
> > > Are you OK with just refusing to set projid 4294967295 for everybody? Or
> > > should we just not try to translate project IDs through user namespaces?
> > > Because XFS does not seem to translate them while ext4 does... What a mess.
> >
> > We do not translate project IDs through user namespaces because
> > they are not usable as a mappable id. Project IDs are only used for
> > customised aggregation of space accounting, unlike UIDs and GIDs
> > that are used primarily for access control. IOWs, PRIDs are
> > fundamentally different to UIDs and GIDs.
> >
> > Project IDs were already being used in the init namespace for
> > directory quotas to limit containers using bind mounts on a host
> > filesystem to an amount of disk space less than the entire hosting
> > filesystem. And once you use PRIDs in the init namespace, they
> > cannot be used by users in other user namespaces, regardless of
> > whether they are mappable or not.
>
> OK, understood.
>
> > Essentially, the project ID mapping stuff was implemented by someone
> > who didn't understand what project IDs were or how project IDs were
> > being used, and then refused to listen to the people who knew these
> > things and wanted them to drop the PRID mapping stuff. And then
> > Linus pulled their tree containing all the uid/gid/prid mapping code
> > without warning and we've been stuck with this shit ever since.
> >
> > Hence in XFS we simply do not allow project IDs to be manipulated
> > outside of the init user namespace, and so mapping them is
> > irrelevant because users in confined namespaces cannot usefully
> > interact with them in any way.
>
> So in ext4 we also don't currently allow anybody outside init user
> namespace to change project IDs. Also as I'm now checking the projid
> handling in ext4 more closely, we always transform project ID only to/from
> init_user_ns (even in FSGETXATTR ioctl) so it's more or less pointless and
> equivalent to XFS not transforming anything AFAIU.

*nod*

> So the only problem is really with VFS quota code. There we do mapping of
> passed project ID from current_user_ns() in fs/quota/quota.c before passing
> the ID further to the core quota code.
> Practically, this is only relevant
> for GETQUOTA quotactl calls because all the others are restricted to
> init_user_ns capable CAP_SYS_ADMIN so they can get called only from
> init_user_ns.
>
> Now we also have a check like:
>
>         /* Filesystems outside of init_user_ns not yet supported */
>         if (sb->s_user_ns != &init_user_ns) {
>                 error = -EINVAL;
>                 goto out_fmt;
>         }
>
> in dquot_load_quota_sb() which is the quota enabling function. So we don't
> allow any quotas for filesystems outside of init_user_ns. So the
> qid_has_mapping() checks are mostly pointless as sb->s_user_ns is always
> init_user_ns. But this is except for id -1, which doesn't have mapping even
> in init_user_ns...

ISTR that was done because it was supposed to be the "invalid ID"
indicator, and so it is common across everything? Kinda like the "nobody"
UID?

[ Apart from the fact that older XFS filesystems only support 16 bit
project IDs, so using 2^32-1 for anything is kinda troublesome. ]

> So I'm pondering what's the best way out of this mess. Currently, the
> mapping of project IDs in quota code has rather limited impact and we may
> be able to get away with just removing it (i.e. without causing a
> regression for any real user). So that's certainly one option. But then we
> should probably also remove the capability to specify (non-trivial) project
> ID maps for user namespaces because having maps that are not actually
> applied is pretty confusing.

*nod*

> Then there's a second option: Is there a reason *not* to map project IDs
> in user namespaces? I understand it's pointless with how project ids are
> currently used but it does not harm either AFAIU. The only real harm is
> with id -1 not being usable. Also when people create fs mount option where
> project ID is changeable by CAP_SYS_ADMIN (or maybe CAP_SYS_RESOURCE)
> capable user - and there are several people asking for a functionality like
> this - then fully mapping project IDs would IMHO make more sense.

I'm not opposed to doing this, however I have not had anyone ask for
this functionality at all. So perhaps it would be better to start
with describing the use cases and user requirements so we can get a
better idea of the applications that people want to use mappable
prids for...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com