2013-12-22 09:59:29

by Zheng Liu

Subject: [RFC] directory quota survey on xfs

Hi all,

As discussed with the ext4 folks, I will try to make the ext4 file system
support directory quota (a.k.a. project id in xfs). To stay consistent
with xfs's implementation, I ran some tests on xfs and summarized the
results below. This will help us understand what we can and can't do.
Please correct me if I missed any tests or misunderstood something.

I only tested rename/hardlink because they were the key issues in our
discussion.

+ unaccounted dir
x accounted dir

rename(mv)
==========

+ -> +: ok

+ -> x: ok

I used strace(1) to look up which syscalls are used, and found that xfs
returns EXDEV when mv(1) tries to use rename(2) to move a dir from
an unaccounted dir to an accounted dir. mv then falls back to
creat(2)/read(2)/write(2) to move the dir.
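
For illustration, the fallback seen in the strace output can be sketched in C roughly as follows. This is a minimal sketch and not the coreutils implementation: move_file() and copy_then_unlink() are hypothetical names, only regular files are handled (mv(1) also recurses into directories), and a failed copy may leave a partial destination behind.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Copy src to dst byte for byte, then unlink src. Regular files only. */
int copy_then_unlink(const char *src, const char *dst)
{
	char buf[65536];
	ssize_t n;
	int failed = 0;
	int in, out;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (out < 0) {
		close(in);
		return -1;
	}
	while (!failed && (n = read(in, buf, sizeof(buf))) > 0) {
		ssize_t off = 0;

		/* write(2) may be short, so loop until the chunk is out */
		while (off < n) {
			ssize_t w = write(out, buf + off, (size_t)(n - off));

			if (w < 0) {
				failed = 1;
				break;
			}
			off += w;
		}
	}
	if (!failed && n < 0)
		failed = 1;	/* read(2) error */
	close(in);
	if (close(out) < 0)
		failed = 1;
	if (failed)
		return -1;
	return unlink(src);
}

/* Try rename(2) first; fall back to copying only when it fails with EXDEV. */
int move_file(const char *src, const char *dst)
{
	if (rename(src, dst) == 0)
		return 0;
	if (errno != EXDEV)
		return -1;
	return copy_then_unlink(src, dst);
}
```

Because the fallback creates a brand new inode in the destination directory, the new file picks up the destination's project ID rather than keeping the source's.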

x -> +: wrong (design by feature?)

If we move a dir from an accounted dir to an unaccounted dir, its usage is
still accounted to the quota. It seems that xfs doesn't remove the project
id from this dir and its subdirs/files in this case.

x -> x: ok

Xfs returns EXDEV when mv(1) uses rename(2) to move a dir from an
accounted dir to another accounted dir (the dirs have different
project ids). mv(1) then falls back to creat(2)/read(2)/write(2) to
move the dir.

summary (rows: source dir, columns: destination dir):
rename      +        x
  +         ok       ok (EXDEV)
  x         wrong    ok (EXDEV)

hardlink(ln)
========

+ -> +: ok

+ -> x: error

Xfs also returns EXDEV to forbid this operation. That means users
are not allowed to hardlink a file from an unaccounted dir into an
accounted dir.

x -> +: ok

This operation succeeds and no error is reported. The quota usage does
not change afterwards. Once both hardlinks are removed, the usage is
released from the quota.

x -> x: error

Xfs returns EXDEV error to forbid this operation.

summary (rows: source dir, columns: destination dir):
hardlink    +        x
  +         ok       error (EXDEV)
  x         ok       error (EXDEV)

As always, any comments or ideas are welcome.

Thanks,
- Zheng

_______________________________________________
xfs mailing list
[email protected]
http://oss.sgi.com/mailman/listinfo/xfs


2013-12-23 01:42:29

by Dave Chinner

Subject: Re: [RFC] directory quota survey on xfs

On Sun, Dec 22, 2013 at 05:59:29PM +0800, Zheng Liu wrote:
> Hi all,
>
> As discussed with ext4 folks, I will try to make ext4 file system support
> directory quota (a.k.a., project id in xfs).

Firstly, project quotas are not directory quotas. Project groups
are simply an aggregation of unrelated inodes with a specific
identifier (i.e. the project ID). Files accessible only to
individual users can share a project ID and hence be accounted to the
same quota, so accounting is independent of uid/gid.

By themselves, project quotas cannot be used to implement direct
subtree quotas - that requires help from a special inode flag that
is used on directory inodes: XFS_DIFLAG_PROJINHERIT

This flag indicates that the directory and all inodes created in the
directory inherit the project ID of the directory. Hence the act of
creating a file in a XFS_DIFLAG_PROJINHERIT marked directory
associates the new file with a specific project group. New
directories also get marked with XFS_DIFLAG_PROJINHERIT so the
behaviour is propagated down the directory tree.

Now, there is nothing to stop us from having files outside the
inheritance subtree from also having the same project ID, and hence
be accounted to the same project group. Indeed, you can have
multiple sub-trees that all use the same project ID and hence are
accounted together. e.g. a project has subdirs in various
directories:

/documentation/project_A
/src/project_A
/build/project_A
/test/project_A
.....
/home/bill/project_A
/home/barry/project_A
/home/benito/project_A
/home/beryl/project_A
.....

All of these directories can be set up with the same project ID
and XFS_DIFLAG_PROJINHERIT flag, and hence all be accounted to the
same project quota, despite being separate, disconnected subtrees.

IOWs, project groups are an accounting mechanism associated with the
inode's project ID, while XFS_DIFLAG_PROJINHERIT is a policy used
to direct how project IDs are applied.
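
To make the split between accounting and policy concrete, here is a toy userspace model in C. It is only a sketch of the behaviour described above, not XFS code: struct pq_inode, pq_create(), pq_count() and the PQ_PROJINHERIT value are hypothetical, and treating project ID 0 as "no project" is an assumption of the model.

```c
#include <stdint.h>

/* Hypothetical stand-in for XFS_DIFLAG_PROJINHERIT; not the real value. */
#define PQ_PROJINHERIT 0x1u

struct pq_inode {
	uint32_t projid;	/* accounting: which project quota is charged */
	unsigned int flags;	/* policy: PQ_PROJINHERIT on directories */
};

/* Initialise an inode newly created inside 'parent'. */
void pq_create(const struct pq_inode *parent, struct pq_inode *child,
	       int is_dir)
{
	child->projid = 0;	/* assumed default: not in any project */
	child->flags = 0;
	if (parent->flags & PQ_PROJINHERIT) {
		child->projid = parent->projid;
		if (is_dir)
			child->flags |= PQ_PROJINHERIT;	/* propagate down */
	}
}

/* Count inodes charged to 'projid', wherever they sit in the namespace. */
unsigned int pq_count(const struct pq_inode *inodes, unsigned int n,
		      uint32_t projid)
{
	unsigned int i, count = 0;

	for (i = 0; i < n; i++)
		if (inodes[i].projid == projid)
			count++;
	return count;
}
```

In this model, two unrelated directories that carry the same project ID and the inherit flag (say /src/project_A and /home/bill/project_A) charge everything created under them to the same project, because pq_count() only looks at the ID and never at the path.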

> For keeping consistency
> with xfs's implementation, I did some tests on xfs, and I summarized the
> result as following. This will help us understand what we can do and
> what we can't do. Please correct me if I miss doing some tests or mis-
> understand something.
>
> I just do some tests about rename/hardlink because they are the key
> issue from our discussion.
>
> + unaccounted dir
> x accounted dir
>
> rename(mv)
> ==========
>
> + -> +: ok
>
> + -> x: ok
>
> I use strace(1) to lookup which syscall is used, and I found that xfs
> will return EXDEV when mv(1) tries to use rename(2) to move a dir from
> a unaccounted dir to a accounted dir. Then mv uses creat(2)/read(2)/
> write(2) syscalls to move this dir.

That's purely an implementation detail, designed to simplify the
change of project ID for an inode. By forcing the new inode to be
created from scratch under the destination's project ID, we don't
have to care about whether rename needs to allocate or free source
directory metadata, what anonymous metadata was accounted to the
source project ID as the source file and directory were modified,
etc. It's as simple as this:

	/*
	 * If we are using project inheritance, we only allow renames
	 * into our tree when the project IDs are the same; else the
	 * tree quota mechanism would be circumvented.
	 */
	if (unlikely((target_dp->i_d.di_flags & XFS_DIFLAG_PROJINHERIT) &&
		     (xfs_get_projid(target_dp) != xfs_get_projid(src_ip)))) {
		error = XFS_ERROR(EXDEV);
		goto error_return;
	}

It also documents the policy for accounting directory tree quotas:
that quota is accounted for when moving *into* an accounted
directory tree, not when moving out of a directory tree.

> x -> +: wrong (design by feature?)
>
> If we move a dir from a accounted dir to a unaccounted dir, the quota is
> still accounted. It seems that xfs don't remove project id from this
> dir and its subdirs/files on this case.

That's the way the directory tree policy was designed: it's designed
to allow project quotas to account for individual files as well as
subtrees. Remember: projects are not confined to a single subtree
and directory tree accounting is done when moving *into* a
controlled tree, not the other way around.

> x -> x: ok
>
> Xfs returns EXDEV error when mv(1) uses rename(2) to move a dir from an
> accounted dir to another accounted dir (These dirs have different
> project ids). Then mv(1) uses creat(2)/read(2)/write(2) syscalls to
> move this dir.
>
> summary:
> rename      +        x
>   +         ok       ok (EXDEV)
>   x         wrong    ok (EXDEV)
>
> hardlink(ln)
> ========
>
> + -> +: ok
>
> + -> x: error
>
> Xfs also returns EXDEV error to forbid this operation. So that means
> that we don't allow users to do a hardlink for a file from a unaccount
> dir to a accounted dir.

Of course - who do you account new changes to? It's the same problem
as linking across directory trees with different project IDs....

>
> x -> +: ok
>
> This operation can be executed and no any error is reported. After that
> the quota doesn't be changed. When both of two hardlinks are removed,
> the quota will be discharged.

Consistent with the rename case - checking is done based on the
destination directory - you can link out to an uncontrolled
destination as the inode is still accounted to the project ID, but
you can't link into a controlled destination with a different
project ID. The check is identical to the one I quoted for rename
above.
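
The destination-based check can be modelled in a few lines of userspace C, which reproduces both the rename and the hardlink tables from the survey. This is a sketch under stated assumptions, not kernel code: struct mv_inode, mv_check() and MV_PROJINHERIT are hypothetical names, and an unaccounted directory is modelled as project ID 0 without the inherit flag.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for XFS_DIFLAG_PROJINHERIT; not the real value. */
#define MV_PROJINHERIT 0x1u

struct mv_inode {
	uint32_t projid;
	unsigned int flags;
};

/*
 * Model of the check applied to rename and link: only the destination
 * directory's flag matters, and only a project ID mismatch is refused.
 * Returns 0 if the operation may proceed, -EXDEV otherwise.
 */
int mv_check(const struct mv_inode *target_dir, const struct mv_inode *src)
{
	if ((target_dir->flags & MV_PROJINHERIT) &&
	    target_dir->projid != src->projid)
		return -EXDEV;
	return 0;
}
```

With an unaccounted destination the function always returns 0, which is why moving or linking out of an accounted tree succeeds and leaves the inode charged to its original project.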

> As always, any comment or idea are welcome.

I'd suggest that you implement project quotas, not directory quotas.
They are way more flexible than pure directory quotas, but with only
a few lines of code and a special directory flag they can be used to
implement directory subtree quotas....

I'd also strongly suggest that you use the XFS userspace quota API
for managing project quotas, so that we can use the same management
tools and tests to verify that they behave the same. Please don't
invent a new version of the quota API to implement this - everything
you need for managing project/directory quotas is already there in
xfs_quota.....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2013-12-23 09:12:19

by Arkadiusz Miśkiewicz

Subject: Re: [RFC] directory quota survey on xfs

On Monday 23 of December 2013, Dave Chinner wrote:

> > As always, any comment or idea are welcome.
>
> I'd suggest that you implement project quotas, not directory quotas.
> They are way more flexible than pure directory quotas, but with only
> a few lines of code and a special directory flag they can be used to
> implement directory subtree quotas....

It would also be nice to allow a file to belong to more than one project.

Let's say I want to have

/projects/ with 10GB quota
and
/projects/projectA/ with 1GB quota
/projects/projectB/ with 2GB quota
and so on that's still limited by /projects/ 10GB quota limit.

(and can't use user/group quota for that since the files belong to various
users/groups)

--
Arkadiusz Miśkiewicz, arekm / maven.pl


2013-12-23 23:44:04

by Dave Chinner

Subject: Re: [RFC] directory quota survey on xfs

On Mon, Dec 23, 2013 at 10:12:19AM +0100, Arkadiusz Miśkiewicz wrote:
> On Monday 23 of December 2013, Dave Chinner wrote:
>
> > > As always, any comment or idea are welcome.
> >
> > I'd suggest that you implement project quotas, not directory quotas.
> > They are way more flexible than pure directory quotas, but with only
> > a few lines of code and a special directory flag they can be used to
> > implement directory subtree quotas....
>
> Would be also nice to allow a file to belong to more than one project.

Not possible. Apart from there only being a single project ID per
inode, having to account an inode to multiple project quotas
effectively makes every transaction in XFS have to modify an unbounded
number of dquots. We don't have the infrastructure to do that, we
can't reserve log space for unbounded transactions, etc.

> Let say I want to have
>
> /projects/ with 10GB quota
> and
> /projects/projectA/ with 1GB quota
> /projects/projectB/ with 2GB quota
> and so on that's still limited by /projects/ 10GB quota limit.

What you get is exclusive accounting - the 10GB limit on /projects/
excludes the limits set on /projects/projectA/ and
/projects/projectB/.

Think about it for a minute - if we make subtrees nest like you
suggest, then:

/projects/ with 10GB quota
/projects/projectA/ with 5GB quota
/projects/projectA/subproj1 with 3GB quota
/projects/projectA/subproj1/ssp2 with 2GB quota
/projects/projectA/subproj1/ssp2/sssp3 with 1GB quota

if we modify a file in ..../sssp3/, then we have 5 project quotas we
have to check for limit enforcement, reserve blocks on and then
transactionally modify (plus user and group for the file itself).

That's exceedingly complex because we don't have pointers to all
the inodes in the path back up to the root, so just to find that we
have nested project quotas requires a reverse path walk to find the
directory inodes to get their project IDs to look up the dquots we'd
need to modify. The complexity and performance overhead of recursive
project quota accounting simply isn't worth it.
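
As a rough illustration of that cost, the toy C model below counts how many project quotas a single modification would touch if quotas nested. It is a sketch only: struct nq_dir and nq_collect() are hypothetical, and the parent pointer is exactly the back-reference that the paragraph above says is not available, so a real implementation would need a reverse path walk instead.

```c
#include <stddef.h>
#include <stdint.h>

struct nq_dir {
	const struct nq_dir *parent;	/* hypothetical back-pointer */
	uint32_t projid;		/* 0 = no project quota here */
};

/*
 * Walk from 'dir' up to the root and gather every project ID that a
 * modification under 'dir' would have to be charged to if quotas
 * nested. Stores at most 'max' IDs in 'ids' and returns the total
 * number found, which grows with directory depth.
 */
size_t nq_collect(const struct nq_dir *dir, uint32_t *ids, size_t max)
{
	size_t n = 0;

	for (; dir != NULL; dir = dir->parent) {
		if (dir->projid == 0)
			continue;
		if (n < max)
			ids[n] = dir->projid;
		n++;
	}
	return n;
}
```

For the five-level example above, nq_collect() called on sssp3 reports five project IDs, each of which would need limit checks, block reservation and a dquot update in the same transaction, on top of the user and group dquots.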

Cheers,

Dave.
--
Dave Chinner
[email protected]

2014-01-15 08:08:06

by Zheng Liu

Subject: Re: [RFC] directory quota survey on xfs

Hi Dave,

On Mon, Dec 23, 2013 at 12:42:22PM +1100, Dave Chinner wrote:
> On Sun, Dec 22, 2013 at 05:59:29PM +0800, Zheng Liu wrote:
> > As always, any comment or idea are welcome.
>
> I'd suggest that you implement project quotas, not directory quotas.
> They are way more flexible than pure directory quotas, but with only
> a few lines of code and a special directory flag they can be used to
> implement directory subtree quotas....

Sorry for the delay. I really appreciate your detailed explanation.

Personally, I agree with you that we should implement project quota
in ext4 and add a flag to support directory quota, to stay consistent
with xfs. But this still needs to be discussed with the other ext4
folks. Later I will write a draft describing my idea for project
quota in ext4. That will let me collect more comments and
suggestions.

>
> I'd also strongly suggest that you use the XFS userspace quota API
> for managing project quotas, so that we can use the same management
> tools and tests to verify that they behave the same. Please don't
> invent a new version of the quota API to implement this - everything
> you need for managing project/directory quotas is already there in
> xfs_quota.....

Frankly, I really don't like this. We already have quota-tools to manage
quotas in ext4, so IMHO we'd better keep using that tool because it
is natural for ext4 users. I can't accept having to install xfsprogs
just to use an ext4 feature. Further, it could confuse users, because
they would use quota to control user/group quotas in ext4 but xfs_quota
to control project quotas. It could also cause trouble for ext4 users
who have written scripts to manage their machines.

Thanks,
- Zheng

2014-01-15 18:03:22

by Andreas Dilger

Subject: Re: [RFC] directory quota survey on xfs

On Jan 15, 2014, at 1:12 AM, Zheng Liu <[email protected]> wrote:
> On Mon, Dec 23, 2013 at 12:42:22PM +1100, Dave Chinner wrote:
>> I'd also strongly suggest that you use the XFS userspace quota API
>> for managing project quotas, so that we can use the same management
>> tools and tests to verify that they behave the same. Please don't
>> invent a new version of the quota API to implement this - everything
>> you need for managing project/directory quotas is already there in
>> xfs_quota.....
>
> Frankly, I don't like this, really. Now we have quota-tool to manage
> the quota in ext4. So IMHO we'd better go on using this tool because it
> is natural for ext4 users. I still couldn't accept this fact that I
> need to install xfsprogs for using a feature of ext4. Further, it could
> make users puzzled because they use quota to control user/group quota in
> ext4, but it uses xfs_quota to control project quota. It could bring
> some troubles for the ext4 users who have written some scripts to manage
> their machines.

Please see Li Xi's recent email "Directory/Project quota supports" on
the linux-ext4 list. He has already added some prototype support for
project quotas to quota-tools.

I think it might make sense to keep the same API as XFS for the ext4
quotas (to keep compatibility for existing XFS deployments), but add
support into quota-tools so that it is usable by all filesystems.

Cheers, Andreas







2014-01-15 21:32:07

by Dave Chinner

Subject: Re: [RFC] directory quota survey on xfs

On Wed, Jan 15, 2014 at 11:03:22AM -0700, Andreas Dilger wrote:
> On Jan 15, 2014, at 1:12 AM, Zheng Liu <[email protected]> wrote:
> > On Mon, Dec 23, 2013 at 12:42:22PM +1100, Dave Chinner wrote:
> >> I'd also strongly suggest that you use the XFS userspace quota API
> >> for managing project quotas, so that we can use the same management
> >> tools and tests to verify that they behave the same. Please don't
> >> invent a new version of the quota API to implement this - everything
> >> you need for managing project/directory quotas is already there in
> >> xfs_quota.....
> >
> > Frankly, I don't like this, really. Now we have quota-tool to manage
> > the quota in ext4. So IMHO we'd better go on using this tool because it
> > is natural for ext4 users.

Zheng - you're confusing the userspace tool that users run with
the quotactl API the tool uses to communicate with the kernel.

> > I still couldn't accept this fact that I
> > need to install xfsprogs for using a feature of ext4. Further, it could
> > make users puzzled because they use quota to control user/group quota in
> > ext4, but it uses xfs_quota to control project quota. It could bring
> > some troubles for the ext4 users who have written some scripts to manage
> > their machines.
>
> Please see Li Xi's recent email "Directory/Project quota supports" on
> the linux-ext4 list. He has already added some prototype support for
> project quotas to quota-tools.

So, while it is a prototype, let's do it the right way. i.e. let's
not reinvent the wheel.

> I think it might make sense to keep the same API as XFS for the ext4
> quotas (to keep compatibility for existing XFS deployments), but add
> support into quota-tools so that it is usable by all filesystems.

Well, yes. If you are writing a generic quota tool, then it needs to
support all filesystems. We already have a fully featured quota API
that can provide this support - it's the API that XFS has been using
since it was ported to Linux. We have the opportunity to unify the
quota APIs that ext4 and XFS use, so we should take the opportunity
while it is here. Don't create a new API for ext4 simply because of
NIH syndrome.

Cheers,

Dave.
--
Dave Chinner
[email protected]


2014-01-21 07:02:44

by Zheng Liu

Subject: Re: [RFC] directory quota survey on xfs

Hi Dave,

On Thu, Jan 16, 2014 at 08:32:07AM +1100, Dave Chinner wrote:
> On Wed, Jan 15, 2014 at 11:03:22AM -0700, Andreas Dilger wrote:
> > On Jan 15, 2014, at 1:12 AM, Zheng Liu <[email protected]> wrote:
> > > On Mon, Dec 23, 2013 at 12:42:22PM +1100, Dave Chinner wrote:
> > >> I'd also strongly suggest that you use the XFS userspace quota API
> > >> for managing project quotas, so that we can use the same management
> > >> tools and tests to verify that they behave the same. Please don't
> > >> invent a new version of the quota API to implement this - everything
> > >> you need for managing project/directory quotas is already there in
> > >> xfs_quota.....
> > >
> > > Frankly, I don't like this, really. Now we have quota-tool to manage
> > > the quota in ext4. So IMHO we'd better go on using this tool because it
> > > is natural for ext4 users.
>
> Zheng - you're confusing the userspace tool that users run with
> the quotactl API the tool uses to communicate with the kernel.

Thanks for pointing it out.

>
> > > I still couldn't accept this fact that I
> > > need to install xfsprogs for using a feature of ext4. Further, it could
> > > make users puzzled because they use quota to control user/group quota in
> > > ext4, but it uses xfs_quota to control project quota. It could bring
> > > some troubles for the ext4 users who have written some scripts to manage
> > > their machines.
> >
> > Please see Li Xi's recent email "Directory/Project quota supports" on
> > the linux-ext4 list. He has already added some prototype support for
> > project quotas to quota-tools.
>
> So, while it is a prototype, lets do it the right way. i.e. let's
> not reinvent the wheel.

Yes, I agree with you and Andreas that we shouldn't reinvent the wheel.

>
> > I think it might make sense to keep the same API as XFS for the ext4
> > quotas (to keep compatibility for existing XFS deployments), but add
> > support into quota-tools so that it is usable by all filesystems.
>
> Well, yes. If you are writing a generic quota tool, then it needs to
> support all filesystems. We already have a fully featured quota API
> that can provide this support - it's the API that XFS has been using
> since it was ported to Linux. We have the opportunity to unify the
> quota APIs that ext4 and XFS, so we should take the opportunity
> while it is here. Don't create a new API for ext4 simply because of
> NIH syndrome.

I have been thinking about your comment on using the quotactl API for
communication between the userspace tool and the kernel. But I am still
confused by your comment about unifying the quota API between ext4 and
XFS.

Currently there are two sets of command flags in quotactl(2). One
(Q_QUOTAON, Q_GETQUOTA, etc.) is used by the extN file systems (I am not
sure whether other file systems use these flags), and the other
(Q_XQUOTAON, Q_XGETQSTAT, etc.) is used by XFS.

xfs_quota uses an ioctl(2) to get/set/check the project id, and calls
quotactl(2) with Q_XSETQLIM/Q_XGETQUOTA to set/get the project quota. On
the kernel side, ->set_dqblk()/->get_dqblk() are called when we set/get
a project quota in XFS. In ext4 the same callback functions are used to
set/get user/group quotas, although in userspace we use Q_SETQUOTA/
Q_GETQUOTA. I am not sure I fully understand what you mean by unifying
the quota API between ext4 and XFS. Do you mean that we should use the
Q_XSETQLIM/Q_XGETQUOTA flags to set/get quotas on ext4? Or is using
quotactl(2) fine with you?

Please correct me if I missed something.

Thanks,
- Zheng

2014-01-23 22:35:38

by Dave Chinner

Subject: Re: [RFC] directory quota survey on xfs

On Tue, Jan 21, 2014 at 03:07:06PM +0800, Zheng Liu wrote:
> Hi Dave,
>
> On Thu, Jan 16, 2014 at 08:32:07AM +1100, Dave Chinner wrote:
> > On Wed, Jan 15, 2014 at 11:03:22AM -0700, Andreas Dilger wrote:
> > > quotas (to keep compatibility for existing XFS deployments), but add
> > > support into quota-tools so that it is usable by all filesystems.
> >
> > Well, yes. If you are writing a generic quota tool, then it needs to
> > support all filesystems. We already have a fully featured quota API
> > that can provide this support - it's the API that XFS has been using
> > since it was ported to Linux. We have the opportunity to unify the
> > quota APIs that ext4 and XFS, so we should take the opportunity
> > while it is here. Don't create a new API for ext4 simply because of
> > NIH syndrome.
>
> These days I was thinking about your comment that uses quotactl API to
> communicate the userspace tool with the kernel. But I am still
> confusing about your comment that unifies the quota API between ext4 and
> XFS.
>
> Now we have two flag sets in quotactl(2). One (Q_QUOTAON, Q_GETQUOTA,
> etc...) is used by extN file system (I am not sure whether other file
> systems use these flags or not), and another (Q_XQUOTAON, Q_XGETQSTAT,
> etc...) is used by XFS.

I'm talking about making ext4 be able to use Q_XQUOTAON,
Q_XGETQSTAT, etc.

> In xfs_quota it uses an ioctl(2) to get/set/check project id,

Right, because that's a filesystem specific operation that has no
equivalent in any other filesystem at this point in time. Same for
the project inheritance inode flag.

You're going to need to add such an interface to ext4 to do this, so
add a generic ioctl and wire XFS up to it as well. This is kind
of why I want a generic xattr namespace for inode flags/attribute
at the VFS - so we don't have to keep inventing new
ioctl/fcntl interfaces to make this sort of functionality common
between different filesystems - we just define a new attribute
string and values and let individual filesystems handle how they
store them.
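
For readers on newer systems, here is a hedged sketch of what reading the project ID through a filesystem-independent ioctl can look like. It assumes kernel headers that provide FS_IOC_FSGETXATTR, struct fsxattr and FS_XFLAG_PROJINHERIT in <linux/fs.h>; these were added to the generic headers after this thread (at the time only the XFS-specific ioctl existed), and get_projid() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Read the project ID and the project-inherit flag of 'path'.
 * Returns 0 on success and -1 on failure (errno is set), for example
 * when the file cannot be opened or the filesystem lacks the ioctl.
 */
int get_projid(const char *path, uint32_t *projid, int *inherits)
{
	struct fsxattr fsx;
	int fd, ret, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	ret = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
	saved = errno;		/* close(2) may overwrite errno */
	close(fd);
	if (ret < 0) {
		errno = saved;
		return -1;
	}
	*projid = fsx.fsx_projid;
	*inherits = (fsx.fsx_xflags & FS_XFLAG_PROJINHERIT) != 0;
	return 0;
}
```

A caller would pass a directory path and check the return value before trusting the two outputs, since filesystems without project ID support reject the ioctl.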

> and calls
> quotactl(2) with Q_XSETQLIM/Q_XGETQUOTA to set/get project quota.

Right, that's the quota management interface - it can also be used
to manage user and group quotas, so in userspace you can just use
the one interface for everything.

> On
> kernel side, ->set_dqblk()/get_dqblk() is called when we try to set/get
> project quota in XFS.

And user/group quotas, too.

> In ext4 the same callback functions are used to
> set/get user/group quota, although on userspace we use Q_SETQUOTA/
> Q_GETQUOTA to set/get quota. I am not sure I fully understand your
> meaning that unifies the quota API between ext4 and XFS. Do you mean
> that we should use Q_XSETQLIM/Q_XGETQUOTA flags to set/get quota on ext4?

What I mean is that you should have quotatool speak the Q_X* quota
protocol defined in include/uapi/linux/dqblk_xfs.h via quotactl(2)
if the underlying filesystem supports it. Indeed, quotatool already
detects XFS filesystems and switches to using the Q_X* interface
automatically, so there shouldn't be a huge amount of change needed
in it to support project quotas via this interface, nor to add ext4
support for the interface.

Then for project quota support in ext4, you implement the Q_X* quota
ops methods for that protocol in a similar way to XFS in
fs/xfs/xfs_quotaops.c. That way ext4 will be able to speak both the
current v0-v2 protocols (so it doesn't break userspace compatibility
with older binaries) and the Q_X* protocol, and the userspace quotatool
will be able to fully manage project quotas on both XFS and ext4
filesystems in a common manner.
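
A minimal sketch of what speaking the Q_X* protocol looks like from userspace is shown below. It assumes glibc's <sys/quota.h> together with the kernel's <linux/dqblk_xfs.h>; project_quota_cmd() and get_project_usage() are hypothetical helper names, and the PRJQUOTA fallback is only there for older headers that do not define it.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/quota.h>
#include <linux/dqblk_xfs.h>

#ifndef PRJQUOTA
#define PRJQUOTA 2	/* project quota type; missing from older headers */
#endif

/* Build the quotactl(2) command word for "get project quota". */
int project_quota_cmd(void)
{
	return QCMD(Q_XGETQUOTA, PRJQUOTA);
}

/*
 * Query usage and limits for project 'projid' on the filesystem backed
 * by 'blockdev'. Returns 0 on success, -1 with errno set on failure.
 * On success, out->d_bcount and out->d_blk_hardlimit are in 512-byte
 * basic blocks.
 */
int get_project_usage(const char *blockdev, uint32_t projid,
		      struct fs_disk_quota *out)
{
	memset(out, 0, sizeof(*out));
	return quotactl(project_quota_cmd(), blockdev, (int)projid,
			(char *)out);
}
```

The same call with USRQUOTA or GRPQUOTA in place of PRJQUOTA queries user and group quotas, which is the "one interface for everything" property argued for above.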

Cheers,

Dave.
--
Dave Chinner
[email protected]

2014-01-24 00:07:02

by Darrick J. Wong

Subject: Re: [RFC] directory quota survey on xfs

On Fri, Jan 24, 2014 at 09:32:05AM +1100, Dave Chinner wrote:
> On Tue, Jan 21, 2014 at 03:07:06PM +0800, Zheng Liu wrote:
> > Hi Dave,
> >
> > On Thu, Jan 16, 2014 at 08:32:07AM +1100, Dave Chinner wrote:
> > > On Wed, Jan 15, 2014 at 11:03:22AM -0700, Andreas Dilger wrote:
> > > > quotas (to keep compatibility for existing XFS deployments), but add
> > > > support into quota-tools so that it is usable by all filesystems.
> > >
> > > Well, yes. If you are writing a generic quota tool, then it needs to
> > > support all filesystems. We already have a fully featured quota API
> > > that can provide this support - it's the API that XFS has been using
> > > since it was ported to Linux. We have the opportunity to unify the
> > > quota APIs that ext4 and XFS, so we should take the opportunity
> > > while it is here. Don't create a new API for ext4 simply because of
> > > NIH syndrome.
> >
> > These days I was thinking about your comment that uses quotactl API to
> > communicate the userspace tool with the kernel. But I am still
> > confusing about your comment that unifies the quota API between ext4 and
> > XFS.
> >
> > Now we have two flag sets in quotactl(2). One (Q_QUOTAON, Q_GETQUOTA,
> > etc...) is used by extN file system (I am not sure whether other file
> > systems use these flags or not), and another (Q_XQUOTAON, Q_XGETQSTAT,
> > etc...) is used by XFS.
>
> I'm talking about making ext4 be able to use Q_XQUOTAON,
> Q_XGETQSTAT, etc.
>
> > In xfs_quota it uses an ioctl(2) to get/set/check project id,
>
> Right, because that's a filesystem specific operation that has no
> equivalent in any other filesystem at this point in time. Same for
> the project inheritence inode flag.
>
> You're going to need to add such an interface to ext4 to do this, so
> add a generic ioctl and wire XFS up to it as well. This is kind
> of why I want a generic xattr namespace for inode flags/attribute
> at the VFS - so we don't have to keep inventing new
> ioctl/fcntl interfaces to make this sort of functionality common
> between different filesystems - we just define a new attribute
> string and values and let individual filesystems handle how they
> store them.

I wonder, do you have an opinion on my patches to do just that? Since it was
an RFC I only wired up ext4; first with string-based xattrs[1] and again as a
(namespace, flags) integer tuple[2]. Jan disliked juggling strings around
(they don't bring me oodles of happiness either), Christoph doesn't like the
magic xattrs, and Ted seemed lukewarm so I'm inclined to fix up the other FSes
a la [2].

(...and please everybody don't co-opt this thread for inode flags any more than
I have.)

--D

[1] http://lkml.org/lkml/2014/1/6/1059 (fs: xattr-based FS_IOC_[GS]ETFLAGS interface)
[2] http://lkml.org/lkml/2014/1/7/534 (fs: new FS_IOC_[GS]ETFLAGS2 interface)