2020-04-28 15:33:09

by Jan Kara

Subject: Re: ext4 and project quotas bugs

Hello!

On Tue 28-04-20 08:41:59, Francois wrote:
> hello! I was just giving ext4 project quotas a try. Definitely not the
> most used ext4 feature (I was the first to answer a stackoverflow
> question on it in https://stackoverflow.com/a/61465057/2852464).
> Quotacheck tells me to mail you the bugs :D so I do

Sure :) Generally for ext4 issues (including project quotas) you can also
ask at [email protected] (added to CC so that the discussion is
archived for other people).

> my goal is to make some kind of ansible playbook to install project
> quotas, so I am interested in using a tool like setquota, I also want
> the teams behind the capped directories, to think about a clean-up
> mechanism (the quota would just be a temporary annoyance for them), so
> it should not be "jailbreakable" too easily.

Hum, that "not jailbreakable" part is going to be difficult unless you also
confine those users in their own user namespace, because any user running
in the initial user namespace is allowed to arbitrarily change the project
ID of the files he owns. Project quotas were designed as an advisory
feature back in the Irix days... There has been talk of a mount option to
tweak this behavior (i.e., to allow only the sysadmin to set project IDs),
but so far nobody has implemented it.

> quota-4.05-3.1.x86_64
> Linux localhost 5.6.2-1-default #1 SMP Thu Apr 2 06:31:32 UTC 2020
> (c8170d6) x86_64 x86_64 x86_64 GNU/Linux
> CPE_NAME="cpe:/o:opensuse:tumbleweed:20200413"
>
> 1- quotacheck fails with quotacheck: Cannot find filesystem to check
> or filesystem not mounted with quota option.
> prjquota is enabled using extended mount options but quotacheck seems
> to ignore this
> # tune2fs -l /dev/loop0 | grep -i mount\ opt
> Default mount options: user_xattr acl
> Mount options: prjquota
>
> (also, shouldn't these mount options be reflected in /proc/mounts?)

Yes and that's deliberate. Unlike user and group quotas, project quotas are
only supported when stored in hidden system files (user and group quotas
are also supported in that way when you create ext4 with 'quota' feature).
So checking of quotas is handled by e2fsck and quotacheck has no way to
influence them.
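
Since project quota support depends on on-disk features rather than on what
quotacheck sees, one way to confirm the setup is to look at the feature list
that tune2fs prints. The helper below is a minimal sketch: has_feature is a
hypothetical name, and it assumes the `tune2fs -l` output contains a
"Filesystem features:" line listing names such as "quota" and "project".

```shell
# Reads `tune2fs -l` output on stdin and succeeds if the named feature
# appears on the "Filesystem features:" line (assumed output format).
has_feature() {
    awk -v want="$1" '
        /^Filesystem features:/ {
            # Fields 1 and 2 are the label; feature names start at field 3.
            for (i = 3; i <= NF; i++)
                if ($i == want)
                    found = 1
        }
        END { exit !found }
    '
}
```

Usage could look like
`tune2fs -l /dev/loop0 | has_feature project && echo "project feature on"`.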

> 2- project quota are a bit too easy to escape:
> dd if=/dev/zero of=someoutput oflag=append
> loop0: write failed, project block limit reached.
> dd: writing to 'someoutput': Disk quota exceeded
> 2467+0 records in
> 2466+0 records out
> 1262592 bytes (1.3 MB, 1.2 MiB) copied, 0.0105432 s, 120 MB/s
> [email protected]:/mnt/loop/abc/mydir3> chattr -p 33 someoutput
> [email protected]:/mnt/loop/abc/mydir3> dd if=/dev/zero of=someoutput
> oflag=append
> dd: writing to 'someoutput': No space left on device
> 127393+0 records in
> 127392+0 records out
> 65224704 bytes (65 MB, 62 MiB) copied, 0.568859 s, 115 MB/s

Yes and as I mentioned above this is deliberate.

> 3- project id '-1' yields fun results:
>
> chattr +P -p -1 .
> dd if=/dev/zero of=someoutput oflag=append
> dd: failed to open 'someoutput': Invalid argument

Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
'this id is not expressible in current user namespace' and some code gets
confused along the way. We should refuse to set project -1 for a file...

> 4- setquota fails but return code is zero
> > /usr/sbin/setquota -P 1 2 3 4 5 /dev/loop0 && echo success!
> setquota: Cannot get quota for project 1 from kernel on /dev/loop0:
> Operation not permitted
> setquota: error while getting quota from /dev/loop0 for #1 (id 1):
> Operation not permitted
> success!

OK, so you are an unprivileged user here, aren't you? So the failure is
expected, just the return code is wrong. Do I understand your complaint
correctly? I'm not able to reproduce that error:

[email protected]:/root> /root/source/quota-tools/setquota -P 1 2 3 4 5 /dev/vdb1 && echo success
bash: /root/source/quota-tools/setquota: Permission denied
[email protected]:/root>

Or what exactly are you testing?
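
For scripted use (such as the playbook mentioned above), a caller that
cannot rely on the exit status could additionally treat any stderr output as
a failure. This is only a workaround sketch: run_strict is a hypothetical
helper, and it assumes the tool writes its error messages to stderr.

```shell
# Runs the given command. Fails if the command exits non-zero OR if it
# writes anything to stderr, even when its exit status is zero.
run_strict() {
    errfile=$(mktemp) || return 1
    "$@" 2>"$errfile"
    status=$?
    if [ -s "$errfile" ]; then
        # Pass the captured messages on, then force a failure status.
        cat "$errfile" >&2
        [ "$status" -ne 0 ] || status=1
    fi
    rm -f "$errfile"
    return "$status"
}
```

With this, `run_strict /usr/sbin/setquota -P 1 2 3 4 5 /dev/loop0 && echo
success!` would not print "success!" when error messages were emitted.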

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


2020-04-28 15:55:45

by Darrick J. Wong

Subject: Re: ext4 and project quotas bugs

On Tue, Apr 28, 2020 at 05:32:28PM +0200, Jan Kara wrote:
> Hello!
>
> On Tue 28-04-20 08:41:59, Francois wrote:
> > hello! I was just giving ext4 project quotas a try. Definitely not the
> > most used ext4 feature (I was the first to answer a stackoverflow
> > question on it in https://stackoverflow.com/a/61465057/2852464).
> > Quotacheck tells me to mail you the bugs :D so I do
>
> Sure :) Generally for ext4 issues (including project quotas) you can also
> ask at [email protected] (added to CC so that the discussion is
> archived for other people).
>
> > my goal is to make some kind of ansible playbook to install project
> > quotas, so I am interested in using a tool like setquota, I also want
> > the teams behind the capped directories, to think about a clean-up
> > mechanism (the quota would just be a temporary annoyance for them), so
> > it should not be "jailbreakable" too easily.
>
> Hum, that "not jailbreakable" part is going to be difficult unless you also
> confine those users also in their user namespace. Because any user is
> allowed to change project ID of the files he owns arbitrarily if he is
> running in the initial user namespace. Project quotas have been designed as
> an advisory feature back in Irix days... There are talks of allowing to
> tweak the behavior (i.e., to allow setting of project id only by sysadmin)
> by a mount option but so far nobody has implemented it.
>
> > quota-4.05-3.1.x86_64
> > Linux localhost 5.6.2-1-default #1 SMP Thu Apr 2 06:31:32 UTC 2020
> > (c8170d6) x86_64 x86_64 x86_64 GNU/Linux
> > CPE_NAME="cpe:/o:opensuse:tumbleweed:20200413"
> >
> > 1- quotacheck fails with quotacheck: Cannot find filesystem to check
> > or filesystem not mounted with quota option.
> > prjquota is enabled using extended mount options but quotacheck seems
> > to ignore this
> > # tune2fs -l /dev/loop0 | grep -i mount\ opt
> > Default mount options: user_xattr acl
> > Mount options: prjquota
> >
> > (also, shouldn't these mount options be reflected in /proc/mounts?)
>
> Yes and that's deliberate. Unlike user and group quotas, project quotas are
> only supported when stored in hidden system files (user and group quotas
> are also supported in that way when you create ext4 with 'quota' feature).
> So checking of quotas is handled by e2fsck and quotacheck has no way to
> influence them.

How /does/ one enable whatever the latest iteration on quota is in ext4?
IIRC the old VFS method (independent non-journalled quota files in the
root dir) is deprecated, which means that the preferred method now is:

# mkfs.ext4 -O quota -E quotatype=usrquota:grpquota:prjquota /dev/fd0

right?

> > 2- project quota are a bit too easy to escape:
> > dd if=/dev/zero of=someoutput oflag=append
> > loop0: write failed, project block limit reached.
> > dd: writing to 'someoutput': Disk quota exceeded

EDQUOT? Hrm, XFS usually returns ENOSPC for project quotas, since we
also change the statfs output to make it look like the mount size is
the project quota's hard limit.

> > 2467+0 records in
> > 2466+0 records out
> > 1262592 bytes (1.3 MB, 1.2 MiB) copied, 0.0105432 s, 120 MB/s
> > [email protected]:/mnt/loop/abc/mydir3> chattr -p 33 someoutput
> > [email protected]:/mnt/loop/abc/mydir3> dd if=/dev/zero of=someoutput
> > oflag=append
> > dd: writing to 'someoutput': No space left on device
> > 127393+0 records in
> > 127392+0 records out
> > 65224704 bytes (65 MB, 62 MiB) copied, 0.568859 s, 115 MB/s
>
> Yes and as I mentioned above this is deliberate.
>
> > 3- project id '-1' yields fun results:
> >
> > chattr +P -p -1 .

Heh, that command doesn't work on xfs. Weird, more kernel bugs to chase...

> > dd if=/dev/zero of=someoutput oflag=append
> > dd: failed to open 'someoutput': Invalid argument
>
> Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
> 'this id is not expressible in current user namespace' and some code gets
> confused along the way. We should refuse to set project -1 for a file...

Awkward part: projid 4294967295 is allowed on XFS (at least by the
kernel), though the xfs quota tools do not permit that.
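
For readers wondering why -1 and 4294967295 are the same project ID: project
IDs are 32-bit unsigned values, so -1 wraps around to the largest one. A
small sketch of that arithmetic (to_u32 is a hypothetical helper, and it
assumes a shell with at least 64-bit arithmetic):

```shell
# Reinterpret a (possibly negative) integer as an unsigned 32-bit value.
to_u32() {
    echo "$(( $1 & 0xFFFFFFFF ))"
}

to_u32 -1   # prints 4294967295
```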

--D

> > 4- setquota fails but return code is zero
> > > /usr/sbin/setquota -P 1 2 3 4 5 /dev/loop0 && echo success!
> > setquota: Cannot get quota for project 1 from kernel on /dev/loop0:
> > Operation not permitted
> > setquota: error while getting quota from /dev/loop0 for #1 (id 1):
> > Operation not permitted
> > success!
>
> OK, so you are an unprivileged user here, aren't you? So the failure is
> expected, just the return code is wrong. Do I understand your complaint
> correctly? I'm not able to reproduce that error:
>
> [email protected]:/root> /root/source/quota-tools/setquota -P 1 2 3 4 5 /dev/vdb1 && echo success
> bash: /root/source/quota-tools/setquota: Permission denied
> [email protected]:/root>
>
> Or what exactly are you testing?
>
> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

2020-04-28 16:48:58

by Jan Kara

Subject: Re: ext4 and project quotas bugs

On Tue 28-04-20 08:53:51, Darrick J. Wong wrote:
> On Tue, Apr 28, 2020 at 05:32:28PM +0200, Jan Kara wrote:
> > Hello!
> >
> > On Tue 28-04-20 08:41:59, Francois wrote:
> > > hello! I was just giving ext4 project quotas a try. Definitely not the
> > > most used ext4 feature (I was the first to answer a stackoverflow
> > > question on it in https://stackoverflow.com/a/61465057/2852464).
> > > Quotacheck tells me to mail you the bugs :D so I do
> >
> > Sure :) Generally for ext4 issues (including project quotas) you can also
> > ask at [email protected] (added to CC so that the discussion is
> > archived for other people).
> >
> > > my goal is to make some kind of ansible playbook to install project
> > > quotas, so I am interested in using a tool like setquota, I also want
> > > the teams behind the capped directories, to think about a clean-up
> > > mechanism (the quota would just be a temporary annoyance for them), so
> > > it should not be "jailbreakable" too easily.
> >
> > Hum, that "not jailbreakable" part is going to be difficult unless you also
> > confine those users also in their user namespace. Because any user is
> > allowed to change project ID of the files he owns arbitrarily if he is
> > running in the initial user namespace. Project quotas have been designed as
> > an advisory feature back in Irix days... There are talks of allowing to
> > tweak the behavior (i.e., to allow setting of project id only by sysadmin)
> > by a mount option but so far nobody has implemented it.
> >
> > > quota-4.05-3.1.x86_64
> > > Linux localhost 5.6.2-1-default #1 SMP Thu Apr 2 06:31:32 UTC 2020
> > > (c8170d6) x86_64 x86_64 x86_64 GNU/Linux
> > > CPE_NAME="cpe:/o:opensuse:tumbleweed:20200413"
> > >
> > > 1- quotacheck fails with quotacheck: Cannot find filesystem to check
> > > or filesystem not mounted with quota option.
> > > prjquota is enabled using extended mount options but quotacheck seems
> > > to ignore this
> > > # tune2fs -l /dev/loop0 | grep -i mount\ opt
> > > Default mount options: user_xattr acl
> > > Mount options: prjquota
> > >
> > > (also, shouldn't these mount options be reflected in /proc/mounts?)
> >
> > Yes and that's deliberate. Unlike user and group quotas, project quotas are
> > only supported when stored in hidden system files (user and group quotas
> > are also supported in that way when you create ext4 with 'quota' feature).
> > So checking of quotas is handled by e2fsck and quotacheck has no way to
> > influence them.
>
> How /does/ one enable whatever the latest iteration on quota is in ext4?
> IIRC the old VFS method (independent non-journalled quota files in the
> root dir) is deprecated, which means that the preferred method now is:
>
> # mkfs.ext4 -O quota -E quotatype=usrquota:grpquota:prjquota /dev/fd0
>
> right?

Yes.

> > > 2- project quota are a bit too easy to escape:
> > > dd if=/dev/zero of=someoutput oflag=append
> > > loop0: write failed, project block limit reached.
> > > dd: writing to 'someoutput': Disk quota exceeded
>
> EDQUOT? Hrm, XFS usually returns ENOSPC for project quotas, since we
> also change the statfs output to make it look like the mount size is
> the project quota's hard limit.

Yeah, we don't special-case project quotas (they are just another quota
type) in fs/quota/, so the errors are the same for all of them...

> > > 2467+0 records in
> > > 2466+0 records out
> > > 1262592 bytes (1.3 MB, 1.2 MiB) copied, 0.0105432 s, 120 MB/s
> > > [email protected]:/mnt/loop/abc/mydir3> chattr -p 33 someoutput
> > > [email protected]:/mnt/loop/abc/mydir3> dd if=/dev/zero of=someoutput
> > > oflag=append
> > > dd: writing to 'someoutput': No space left on device
> > > 127393+0 records in
> > > 127392+0 records out
> > > 65224704 bytes (65 MB, 62 MiB) copied, 0.568859 s, 115 MB/s
> >
> > Yes and as I mentioned above this is deliberate.
> >
> > > 3- project id '-1' yields fun results:
> > >
> > > chattr +P -p -1 .
>
> Heh, that command doesn't work on xfs. Weird, more kernel bugs to chase...

Yeah, unlike ext4, xfs refuses to set the PROJINHERIT flag through the
SETFLAGS ioctl, which is what makes this command fail. But I don't see
anything in the handling of XFS_IOC_FSSETXATTR that would prevent setting
this invalid project id...

Hum, after some debugging, what I believe is failing is the dquot_init()
call, which tries to initialize project quotas for a new inode. It calls
qid_has_mapping() from dqget() for this -1 project ID, gets an error back
(ultimately from map_id_up()), and so dqget() fails with -EINVAL, which
gets propagated out to userspace.

> > > dd if=/dev/zero of=someoutput oflag=append
> > > dd: failed to open 'someoutput': Invalid argument
> >
> > Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
> > 'this id is not expressible in current user namespace' and some code gets
> > confused along the way. We should refuse to set project -1 for a file...
>
> Awkward part: projid 4294967295 is allowed on XFS (at least by the
> kernel), though the xfs quota tools do not permit that.

Are you OK with just refusing to set projid 4294967295 for everybody? Or
should we just not try to translate project IDs through user namespaces?
Because XFS does not seem to translate them while ext4 does... What a mess.

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2020-04-29 02:42:36

by Dave Chinner

Subject: Re: ext4 and project quotas bugs

On Tue, Apr 28, 2020 at 06:48:24PM +0200, Jan Kara wrote:
> On Tue 28-04-20 08:53:51, Darrick J. Wong wrote:
> > On Tue, Apr 28, 2020 at 05:32:28PM +0200, Jan Kara wrote:
> > > > dd if=/dev/zero of=someoutput oflag=append
> > > > dd: failed to open 'someoutput': Invalid argument
> > >
> > > Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
> > > 'this id is not expressible in current user namespace' and some code gets
> > > confused along the way. We should refuse to set project -1 for a file...
> >
> > Awkward part: projid 4294967295 is allowed on XFS (at least by the
> > kernel), though the xfs quota tools do not permit that.
>
> Are you OK with just refusing to set projid 4294967295 for everybody? Or
> should we just not try to translate project IDs through user namespaces?
> Because XFS does not seem to translate them while ext4 does... What a mess.

We do not translate project IDs through user namespaces because
they are not usable as a mappable id. Project IDs are only used for
customised aggregation of space accounting, unlike UIDs and GIDs
that are used primarily for access control. IOWs, PRIDs are
fundamentally different to UIDs and GIDs.

Project IDs were already being used in the init namespace for
directory quotas to limit containers using bind mounts on a host
filesystem to an amount of disk space less than the entire hosting
filesystem. And once you use PRIDs in the init namespace, they
cannot be used by users in other user namespaces, regardless of
whether they are mappable or not.

Essentially, the project ID mapping stuff was implemented by someone
who didn't understand what project IDs were or how project IDs were
being used, and then refused to listen to the people who knew these
things and wanted them to drop the PRID mapping stuff. And then
Linus pulled their tree containing all the uid/gid/prid mapping code
without warning and we've been stuck with this shit ever since.

Hence in XFS we simply do not allow project IDs to be manipulated
outside of the init user namespace, and so mapping them is
irrelevant because users in confined namespaces cannot usefully
interact with them in any way.

Cheers,

Dave.
--
Dave Chinner
[email protected]

2020-04-29 03:34:52

by Andreas Dilger

Subject: Re: ext4 and project quotas bugs

On Apr 28, 2020, at 9:32 AM, Jan Kara <[email protected]> wrote:
>
> Hello!
>
> On Tue 28-04-20 08:41:59, Francois wrote:
>> my goal is to make some kind of ansible playbook to install project
>> quotas, so I am interested in using a tool like setquota, I also want
>> the teams behind the capped directories, to think about a clean-up
>> mechanism (the quota would just be a temporary annoyance for them), so
>> it should not be "jailbreakable" too easily.
>
> Hum, that "not jailbreakable" part is going to be difficult unless you also
> confine those users also in their user namespace. Because any user is
> allowed to change project ID of the files he owns arbitrarily if he is
> running in the initial user namespace. Project quotas have been designed as
> an advisory feature back in Irix days... There are talks of allowing to
> tweak the behavior (i.e., to allow setting of project id only by sysadmin)
> by a mount option but so far nobody has implemented it.

We tried to implement this for ext4, but Dave Chinner argued that
allowing anyone (at least in the root namespace) to set the project
ID to anything they wanted was part of how project quotas are
_supposed_ to work.

We ended up adding a restriction at the Lustre level, defaulting to
only allow root (chprojid_gid=0, via CAP_SYS_RESOURCE), or admins in
a specific numeric group (with chprojid_gid=N), to change the projid,
and denying regular users the ability to change the projid of files.

This can be changed by setting "chprojid_gid=-1" to allow users in
any group to change the projid of files, returning the XFS behavior.
The "chprojid_gid" is essentially a sysfs tunable for Lustre, but it
could also/instead be a mount option for ext4, if that is preferred.
I don't have a particular attachment to the parameter name, or how
it is set by the admin, but I think something like this is needed.
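
The policy described above can be summarised as a small decision function.
This is only a sketch of the semantics as stated in this mail, not the
Lustre implementation: may_change_projid and its argument order are
hypothetical, and it assumes a privileged caller is always allowed.

```shell
# may_change_projid CHPROJID_GID IS_PRIVILEGED [GID...]
#   CHPROJID_GID  : 0 = privileged callers only, N = members of group N,
#                   -1 = everybody
#   IS_PRIVILEGED : 1 if the caller is root / has the required capability
#   GID...        : the groups the caller belongs to
may_change_projid() {
    setting=$1
    privileged=$2
    shift 2
    [ "$privileged" = 1 ] && return 0   # assumed: privileged always allowed
    [ "$setting" = -1 ] && return 0     # unrestricted behavior
    [ "$setting" = 0 ] && return 1      # default: privileged callers only
    for gid in "$@"; do
        [ "$gid" = "$setting" ] && return 0
    done
    return 1
}
```

For example, `may_change_projid 500 0 100 500` succeeds because the
unprivileged caller is a member of group 500.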


>> 2- project quota are a bit too easy to escape:
>> dd if=/dev/zero of=someoutput oflag=append
>> loop0: write failed, project block limit reached.
>> dd: writing to 'someoutput': Disk quota exceeded
>> 2467+0 records in
>> 2466+0 records out
>> 1262592 bytes (1.3 MB, 1.2 MiB) copied, 0.0105432 s, 120 MB/s
>> [email protected]:/mnt/loop/abc/mydir3> chattr -p 33 someoutput
>> [email protected]:/mnt/loop/abc/mydir3> dd if=/dev/zero of=someoutput
>> oflag=append
>> dd: writing to 'someoutput': No space left on device
>> 127393+0 records in
>> 127392+0 records out
>> 65224704 bytes (65 MB, 62 MiB) copied, 0.568859 s, 115 MB/s
>
> Yes and as I mentioned above this is deliberate.

That may be the historical XFS behavior, but IMHO, it doesn't make
this behavior *useful*. If *anyone* can change the projid of files
that makes them mostly useless. They might be OK for informational
or accounting purposes (e.g. fast "du" of a directory) in a friendly
user environment, but they are useless for any space management (i.e.
anyone can easily bypass project limits by "chattr -p $RANDOM <file>").

I'd prefer to make the project quotas useful out of the box for ext4,
by implementing the chprojid_gid tunable, or something equivalent.
If there are users/sites that want identical behavior to XFS, they
can always set chprojid_gid=-1 to allow anyone to change the projid.

I'd be happy to understand what Dave doesn't like about this proposal,
but the last time the enforcement of project quotas was discussed, my
attempt to figure this out ended with silence, see thread ending at:

https://lore.kernel.org/linux-ext4/[email protected]/

Maybe this time we can get over the hump? Is it just some implicit
difference between "directory quota" and "project quota" that exists
in XFS that I (and everyone using ext4) do not understand?

Cheers, Andreas


PS: Implementing /etc/projid support for e2fsprogs chattr/lsattr to
allow project names would also be useful, but IMHO less important.
That would be a relatively easy feature for someone to implement,
since it only involves userspace and is unlikely to get objections.





2020-04-29 15:05:36

by Darrick J. Wong

Subject: Re: ext4 and project quotas bugs

On Tue, Apr 28, 2020 at 09:34:09PM -0600, Andreas Dilger wrote:
> On Apr 28, 2020, at 9:32 AM, Jan Kara <[email protected]> wrote:
> >
> > Hello!
> >
> > On Tue 28-04-20 08:41:59, Francois wrote:
> >> my goal is to make some kind of ansible playbook to install project
> >> quotas, so I am interested in using a tool like setquota, I also want
> >> the teams behind the capped directories, to think about a clean-up
> >> mechanism (the quota would just be a temporary annoyance for them), so
> >> it should not be "jailbreakable" too easily.
> >
> > Hum, that "not jailbreakable" part is going to be difficult unless you also
> > confine those users also in their user namespace. Because any user is
> > allowed to change project ID of the files he owns arbitrarily if he is
> > running in the initial user namespace. Project quotas have been designed as
> > an advisory feature back in Irix days... There are talks of allowing to
> > tweak the behavior (i.e., to allow setting of project id only by sysadmin)
> > by a mount option but so far nobody has implemented it.
>
> We tried to implement this for ext4, but Dave Chinner argued that
> allowing anyone (at least in the root namespace) to set the project
> ID to anything they wanted was part of how project quotas are
> _supposed_ to work.
>
> We ended up adding a restriction at the Lustre level, defaulting to
> only allow root (chprojid_gid=0, via CAP_SYS_RESOURCE), or admins in
> a specific numeric group (with chprojid_gid=N), to change the projid,
> and denying regular users the ability to change the projid of files.
>
> This can be changed by setting "chprojid_gid=-1" to allow users in
> any group to change the projid of files, returning the XFS behavior.
> The "chprojid_gid" is essentially a sysfs tunable for Lustre, but it
> could also/instead be a mount option for ext4, if that is preferred.
> I don't have a particular attachment to the parameter name, or how
> it is set by the admin, but I think something like this is needed.
>
>
> >> 2- project quota are a bit too easy to escape:
> >> dd if=/dev/zero of=someoutput oflag=append
> >> loop0: write failed, project block limit reached.
> >> dd: writing to 'someoutput': Disk quota exceeded
> >> 2467+0 records in
> >> 2466+0 records out
> >> 1262592 bytes (1.3 MB, 1.2 MiB) copied, 0.0105432 s, 120 MB/s
> >> [email protected]:/mnt/loop/abc/mydir3> chattr -p 33 someoutput
> >> [email protected]:/mnt/loop/abc/mydir3> dd if=/dev/zero of=someoutput
> >> oflag=append
> >> dd: writing to 'someoutput': No space left on device
> >> 127393+0 records in
> >> 127392+0 records out
> >> 65224704 bytes (65 MB, 62 MiB) copied, 0.568859 s, 115 MB/s
> >
> > Yes and as I mentioned above this is deliberate.
>
> That may be the historical XFS behavior, but IMHO, it doesn't make
> this behavior *useful*. If *anyone* can change the projid of files
> that makes them mostly useless. They might be OK for informational
> or accounting purposes (e.g. fast "du" of a directory) in a friendly
> user environment, but they are useless for any space management (i.e.
> anyone can easily bypass project limits by "chattr -p $RANDOM <file>").
>
> I'd prefer to make the project quotas useful out of the box for ext4,
> by implementing the chprojid_gid tunable, or something equivalent.
> If there are users/sites that want identical behavior to XFS, they
> can always set chprojid_gid=-1 to allow anyone to change the projid.
>
> I'd be happy to understand what Dave doesn't like about this proposal,
> but the last time the enforcement of project quotas was discussed, my
> attempt to figure this out ended with silence, see thread ending at:
>
> https://lore.kernel.org/linux-ext4/[email protected]/
>
> Maybe this time we can get over the hump? Is it just some implicit
> difference between "directory quota" and "project quota" that exists
> in XFS that I (and everyone using ext4) do not understand?

I don't have any particular objection to adding an admin-controlled
means to restrict who can change project ids on a file, other than let's
do this in a consistent way for the three fses that support prjquota.

Personally, I thought Dave was stating how we got to the current
prjquota implementation w/ not-entirely-intuitive Irix behavior and then
asked for a concrete definition of new behavior + patches and was
waiting to see if Wang or someone would send out f2fs/ext4/xfs patches...

--D

> Cheers, Andreas
>
>
> PS: Implementing /etc/projid support for e2fsprogs chattr/lsattr to
> allow project names would also be useful, but IMHO less important.
> That would be a relatively easy feature for someone to implement,
> since it only involves userspace and is unlikely to get objections.
>
>
>


2020-04-30 11:16:11

by Jan Kara

Subject: Re: ext4 and project quotas bugs

On Wed 29-04-20 12:42:01, Dave Chinner wrote:
> On Tue, Apr 28, 2020 at 06:48:24PM +0200, Jan Kara wrote:
> > On Tue 28-04-20 08:53:51, Darrick J. Wong wrote:
> > > On Tue, Apr 28, 2020 at 05:32:28PM +0200, Jan Kara wrote:
> > > > > dd if=/dev/zero of=someoutput oflag=append
> > > > > dd: failed to open 'someoutput': Invalid argument
> > > >
> > > > Yes, that's a bug that should be fixed. Thanks for reporting this! -1 means
> > > > 'this id is not expressible in current user namespace' and some code gets
> > > > confused along the way. We should refuse to set project -1 for a file...
> > >
> > > Awkward part: projid 4294967295 is allowed on XFS (at least by the
> > > kernel), though the xfs quota tools do not permit that.
> >
> > Are you OK with just refusing to set projid 4294967295 for everybody? Or
> > should we just not try to translate project IDs through user namespaces?
> > Because XFS does not seem to translate them while ext4 does... What a mess.
>
> We do not translate project IDs through user namespaces because
> they are not usable as a mappable id. Project IDs are only used for
> customised aggregation of space accounting, unlike UIDs and GIDs
> that are used primarily for access control. IOWs, PRIDs are
> fundamentally different to UIDs and GIDs.
>
> Project IDs were already being used in the init namespace for
> directory quotas to limit containers using bind mounts on a host
> filesystem to an amount of disk space less than the entire hosting
> filesystem. And once you use PRIDs in the init namespace, they
> cannot be used by users in other user namespaces, regardless of
> whether they are mappable or not.

OK, understood.

> Essentially, the project ID mapping stuff was implemented by someone
> who didn't understand what project IDs were or how project IDs were
> being used, and then refused to listen to the people who knew these
> things and wanted them to drop the PRID mapping stuff. And then
> Linus pulled their tree containing all the uid/gid/prid mapping code
> without warning and we've been stuck with this shit ever since.
>
> Hence in XFS we simply do not allow project IDs to be manipulated
> outside of the init user namespace, and so mapping them is
> irrelevant because users in confined namespaces cannot usefully
> interact with them in any way.

So in ext4 we also don't currently allow anybody outside the init user
namespace to change project IDs. Also, now that I'm checking the projid
handling in ext4 more closely, we only ever transform project IDs to/from
init_user_ns (even in the FSGETXATTR ioctl), so the transformation is more
or less pointless and equivalent to XFS not transforming anything AFAIU.

So the only real problem is with the VFS quota code. There we map the
passed project ID from current_user_ns() in fs/quota/quota.c before passing
the ID further to the core quota code. Practically, this is only relevant
for GETQUOTA quotactl calls, because all the others are restricted to
callers with CAP_SYS_ADMIN in init_user_ns, so they can only be called from
init_user_ns.

Now we also have a check like:

	/* Filesystems outside of init_user_ns not yet supported */
	if (sb->s_user_ns != &init_user_ns) {
		error = -EINVAL;
		goto out_fmt;
	}

in dquot_load_quota_sb(), which is the quota enabling function. So we don't
allow any quotas for filesystems outside of init_user_ns, and the
qid_has_mapping() checks are mostly pointless as sb->s_user_ns is always
init_user_ns. The exception is id -1, which doesn't have a mapping even in
init_user_ns...
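
To illustrate why -1 is special, here is a toy userspace model of an ID
lookup through a single map extent. It is a sketch, not the kernel code: the
function names mirror the kernel helpers only loosely, and it assumes the
initial namespace maps IDs 0 through 4294967294 one-to-one (first = 0,
lower_first = 0, count = 4294967295), with 4294967295 doubling as the "no
mapping" return value.

```shell
# map_id_up ID FIRST LOWER_FIRST COUNT
# Prints the mapped ID, or 4294967295 ((u32)-1) when ID is outside the extent.
map_id_up() {
    id=$1
    first=$2
    lower_first=$3
    count=$4
    if [ "$id" -ge "$lower_first" ] && [ "$id" -lt $(( lower_first + count )) ]; then
        echo $(( first + id - lower_first ))
    else
        echo 4294967295
    fi
}

# Succeeds only if the ID maps to something other than the "invalid" value.
has_mapping() {
    [ "$(map_id_up "$@")" != 4294967295 ]
}
```

In this model `has_mapping 33 0 0 4294967295` succeeds, while
`has_mapping 4294967295 0 0 4294967295` fails, which matches the -EINVAL
seen for project ID -1.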

So I'm pondering what's the best way out of this mess. Currently, the
mapping of project IDs in quota code has rather limited impact and we may
be able to get away with just removing it (i.e. without causing a
regression for any real user). So that's certainly one option. But then we
should probably also remove the capability to specify (non-trivial) project
ID maps for user namespaces because having maps that are not actually
applied is pretty confusing.

Then there's a second option: Is there a reason *not* to map project IDs
in user namespaces? I understand it's pointless with how project IDs are
currently used, but it does no harm either AFAIU. The only real harm is
with id -1 not being usable. Also, if people add a filesystem mount option
that makes the project ID changeable by a CAP_SYS_ADMIN (or maybe
CAP_SYS_RESOURCE) capable user - and several people are asking for
functionality like this - then fully mapping project IDs would IMHO make
more sense.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR