Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
and create a file larger than the limit. Add a check for new size in
the fallocate system call.
File-systems supporting fallocate such as ext4 are affected by this
bug.
Signed-off-by: Nikanth Karthikesan <[email protected]>
Reported-by: Eelis - <[email protected]>
---
diff --git a/fs/open.c b/fs/open.c
index 74e5cd9..95ce069 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -412,10 +412,14 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
return -ENODEV;
- /* Check for wrap through zero too */
- if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
+ /* Check for wrap through zero */
+ if (offset+len < 0)
return -EFBIG;
+ ret = inode_newsize_ok(inode, (offset + len));
+ if (ret)
+ return ret;
+
if (!inode->i_op->fallocate)
return -EOPNOTSUPP;
On 04/28/2010 09:24 PM, Nikanth Karthikesan Wrote:
> Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
>
> Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
> and create a file larger than the limit. Add a check for new size in
> the fallocate system call.
>
> File-systems supporting fallocate such as ext4 are affected by this
> bug.
>
> Signed-off-by: Nikanth Karthikesan <[email protected]>
> Reported-by: Eelis - <[email protected]>
>
> ---
>
> diff --git a/fs/open.c b/fs/open.c
> index 74e5cd9..95ce069 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -412,10 +412,14 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
> return -ENODEV;
>
> - /* Check for wrap through zero too */
> - if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
> + /* Check for wrap through zero */
> + if (offset+len < 0)
> return -EFBIG;
>
> + ret = inode_newsize_ok(inode, (offset + len));
> + if (ret)
> + return ret;
> +
> if (!inode->i_op->fallocate)
> return -EOPNOTSUPP;
>
Hi Nikanth,
>From definition of inode_newsize_ok(), it says,
64 /**
65 * inode_newsize_ok - may this inode be truncated to a given size
66 * @inode: the inode to be truncated
67 * @offset: the new size to assign to the inode
68 * @Returns: 0 on success, -ve errno on failure
69 *
70 * inode_newsize_ok will check filesystem limits and ulimits to check that the
71 * new inode size is within limits. inode_newsize_ok will also send SIGXFSZ
72 * when necessary. Caller must not proceed with inode size change if failure is
73 * returned. @inode must be a file (not directory), with appropriate
74 * permissions to allow truncate (inode_newsize_ok does NOT check these
75 * conditions).
76 *
77 * inode_newsize_ok must be called with i_mutex held.
78 */
In execution path of do_fallocate(), it seems no i_mutex held, and inode might be directory. Using inode_newsize_ok()
might be confused ?
How about this one ?
fs/open.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/fs/open.c b/fs/open.c
index 74e5cd9..24d75a4 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -383,6 +383,7 @@ SYSCALL_ALIAS(sys_ftruncate64, SyS_ftruncate64);
int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
{
struct inode *inode = file->f_path.dentry->d_inode;
+ unsigned long limit;
long ret;
if (offset < 0 || len <= 0)
@@ -412,10 +413,19 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
return -ENODEV;
- /* Check for wrap through zero too */
- if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
+ /* Check for wrap through zero */
+ if ((offset + len) < 0)
return -EFBIG;
+ /* Check for rlimit */
+ if ((offset + len) > inode->i_size) {
+ limit = rlimit(RLIMIT_FSIZE);
+ if (limit != RLIM_INFINITY && (offset + len) > limit)
+ return -EFBIG;
+ if ((offset + len) > inode->i_sb->s_maxbytes)
+ return -EFBIG;
+ }
+
if (!inode->i_op->fallocate)
return -EOPNOTSUPP;
Something I don't understand here is, why no i_mutex hold here ?
--
Coly Li
SuSE Labs
On 2010-04-29, at 00:13, Coly Li wrote:
>> + /*
>> + * Let individual file system decide if it supports
>> + * preallocation for directories or not.
>> + */
>> + if (offset > inode->i_sb->s_maxbytes)
>> + return -EFBIG;
>
> For file systems support online-resize, like ext4 and ocfs2, should we also need a locking to protect inode->i_sb->s_mabytes ?
s_maxbytes is the maximum theoretical single file size, and has nothing to do with the size of the filesystem.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
(Amit Arora <[email protected]> wrote fallocate. cc added)
On Thu, 29 Apr 2010 10:14:06 +0530
Nikanth Karthikesan <[email protected]> wrote:
> Here is an updated patch that takes the i_mutex and calls inode_newsize_ok()
> only for regular files.
err, no. It's taking i_lock where it meant to take i_mutex.
> Thanks
> Nikanth
>
> Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
>
> Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
> and create a file larger than the limit. Add a check for new size in
> the fallocate system call.
>
> File-systems supporting fallocate such as ext4 are affected by this
> bug.
>
> Signed-off-by: Nikanth Karthikesan <[email protected]>
> Reported-by: Eelis - <[email protected]>
>
> ---
>
> diff --git a/fs/open.c b/fs/open.c
> index 74e5cd9..4ca57c9 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -405,17 +405,26 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> if (S_ISFIFO(inode->i_mode))
> return -ESPIPE;
>
> - /*
> - * Let individual file system decide if it supports preallocation
> - * for directories or not.
> - */
> - if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
> - return -ENODEV;
> -
> - /* Check for wrap through zero too */
> - if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
> + /* Check for wrap through zero */
> + if (offset+len < 0)
> return -EFBIG;
I suggest that this test be moved up to where the function tests `if
(offset < 0 || len <= 0)' - it seems more logical.
Also,
- if (offset+len < 0)
+ if (offset + len < 0)
for consistency with most other kernel code, please.
> + if (S_ISREG(inode->i_mode)) {
> + spin_lock(&inode->i_lock);
> + ret = inode_newsize_ok(inode, (offset + len));
> + spin_unlock(&inode->i_lock);
> + if (ret)
> + return ret;
> + } else if (S_ISDIR(inode->i_mode)) {
> + /*
> + * Let individual file system decide if it supports
> + * preallocation for directories or not.
> + */
> + if (offset > inode->i_sb->s_maxbytes)
> + return -EFBIG;
> + } else
> + return -ENODEV;
> +
> if (!inode->i_op->fallocate)
> return -EOPNOTSUPP;
Also, there doesn't seem to be much point in doing
mutex_lock(i_mutex);
if (some_condition)
bale out
mutex_unlock(i_mutex);
<stuff>
because `some_condition' can now become true before or during the
execution of `stuff'.
IOW, it's racy.
On 2010-04-30, at 15:33, Andrew Morton wrote:
> On Thu, 29 Apr 2010 10:14:06 +0530
> Nikanth Karthikesan <[email protected]> wrote:
>> diff --git a/fs/open.c b/fs/open.c
>> index 74e5cd9..4ca57c9 100644
>> --- a/fs/open.c
>> +++ b/fs/open.c
>> @@ -405,17 +405,26 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
>> if (S_ISFIFO(inode->i_mode))
>> return -ESPIPE;
>>
>> - /*
>> - * Let individual file system decide if it supports preallocation
>> - * for directories or not.
>> - */
>> - if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
>> - return -ENODEV;
>> -
>> - /* Check for wrap through zero too */
>> - if (((offset + len) > inode->i_sb->s_maxbytes) || ((offset + len) < 0))
>> + /* Check for wrap through zero */
>> + if (offset+len < 0)
>> return -EFBIG;
>
> I suggest that this test be moved up to where the function tests `if
> (offset < 0 || len <= 0)' - it seems more logical.
Sometimes the order of these checks is mandated by POSIX because of the error return code. I'm not saying for sure that is the case here, but sometimes logic doesn't come into the specification. :-/
Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
On Fri, Apr 30, 2010 at 02:33:19PM -0700, Andrew Morton wrote:
>
> (Amit Arora <[email protected]> wrote fallocate. cc added)
Thanks for adding me to CC.
> On Thu, 29 Apr 2010 10:14:06 +0530
> Nikanth Karthikesan <[email protected]> wrote:
>
> > Here is an updated patch that takes the i_mutex and calls inode_newsize_ok()
> > only for regular files.
>
> err, no. It's taking i_lock where it meant to take i_mutex.
>
> > Thanks
> > Nikanth
> >
> > + if (S_ISREG(inode->i_mode)) {
> > + spin_lock(&inode->i_lock);
> > + ret = inode_newsize_ok(inode, (offset + len));
> > + spin_unlock(&inode->i_lock);
> > + if (ret)
> > + return ret;
> > + } else if (S_ISDIR(inode->i_mode)) {
> > + /*
> > + * Let individual file system decide if it supports
> > + * preallocation for directories or not.
> > + */
> > + if (offset > inode->i_sb->s_maxbytes)
> > + return -EFBIG;
> > + } else
> > + return -ENODEV;
> > +
> > if (!inode->i_op->fallocate)
> > return -EOPNOTSUPP;
>
> Also, there doesn't seem to be much point in doing
>
> mutex_lock(i_mutex);
> if (some_condition)
> bale out
> mutex_unlock(i_mutex);
>
> <stuff>
>
> because `some_condition' can now become true before or during the
> execution of `stuff'.
>
> IOW, it's racy.
Agreed. How about doing this check in the filesystem specific fallocate
inode routines instead ? For example, in ext4 we could do :
diff -Nuarp linux-2.6.org/fs/ext4/extents.c linux-2.6.new/fs/ext4/extents.c
--- linux-2.6.org/fs/ext4/extents.c 2010-05-01 12:16:07.000000000 +0530
+++ linux-2.6.new/fs/ext4/extents.c 2010-05-01 12:17:37.000000000 +0530
@@ -3672,6 +3672,11 @@ long ext4_fallocate(struct inode *inode,
*/
credits = ext4_chunk_trans_blocks(inode, max_blocks);
mutex_lock(&inode->i_mutex);
+ ret = inode_newsize_ok(inode, (offset + len));
+ if (ret) {
+ mutex_unlock(&inode->i_mutex);
+ return ret;
+ }
retry:
while (ret >= 0 && ret < max_blocks) {
block = block + ret;
Similarly for ocfs2, btrfs and xfs..
--
Regards,
Amit Arora
On Sat, May 01, 2010 at 12:34:26PM +0530, Amit K. Arora wrote:
> Agreed. How about doing this check in the filesystem specific fallocate
> inode routines instead ? For example, in ext4 we could do :
That looks okay - in fact XFS should already have this check because
it re-uses the setattr implementation to set the size.
Can you submit an xfstests testcase to verify this behaviour on all
filesystems?
On Saturday 01 May 2010 12:34:26 Amit K. Arora wrote:
> On Fri, Apr 30, 2010 at 02:33:19PM -0700, Andrew Morton wrote:
> > (Amit Arora <[email protected]> wrote fallocate. cc added)
>
> Thanks for adding me to CC.
>
> > On Thu, 29 Apr 2010 10:14:06 +0530
> >
> > Nikanth Karthikesan <[email protected]> wrote:
> > > Here is an updated patch that takes the i_mutex and calls
> > > inode_newsize_ok() only for regular files.
> >
> > err, no. It's taking i_lock where it meant to take i_mutex.
> >
> > > Thanks
> > > Nikanth
> > >
> > > + if (S_ISREG(inode->i_mode)) {
> > > + spin_lock(&inode->i_lock);
> > > + ret = inode_newsize_ok(inode, (offset + len));
> > > + spin_unlock(&inode->i_lock);
> > > + if (ret)
> > > + return ret;
> > > + } else if (S_ISDIR(inode->i_mode)) {
> > > + /*
> > > + * Let individual file system decide if it supports
> > > + * preallocation for directories or not.
> > > + */
> > > + if (offset > inode->i_sb->s_maxbytes)
> > > + return -EFBIG;
> > > + } else
> > > + return -ENODEV;
> > > +
> > > if (!inode->i_op->fallocate)
> > > return -EOPNOTSUPP;
> >
> > Also, there doesn't seem to be much point in doing
> >
> > mutex_lock(i_mutex);
> > if (some_condition)
> > bale out
> > mutex_unlock(i_mutex);
> >
> > <stuff>
> >
> > because `some_condition' can now become true before or during the
> > execution of `stuff'.
> >
> > IOW, it's racy.
>
oh, yes. :(
> Agreed. How about doing this check in the filesystem specific fallocate
> inode routines instead ? For example, in ext4 we could do :
>
I guess, calling the filesystem specific fallocate with the lock held would
create lock ordering problems? If so, this might be the only way. But it would
be better to document at the call site, that the callee should check for
RLIMIT_FSIZE.
Thanks
Nikanth
> diff -Nuarp linux-2.6.org/fs/ext4/extents.c linux-2.6.new/fs/ext4/extents.c
> --- linux-2.6.org/fs/ext4/extents.c 2010-05-01 12:16:07.000000000 +0530
> +++ linux-2.6.new/fs/ext4/extents.c 2010-05-01 12:17:37.000000000 +0530
> @@ -3672,6 +3672,11 @@ long ext4_fallocate(struct inode *inode,
> */
> credits = ext4_chunk_trans_blocks(inode, max_blocks);
> mutex_lock(&inode->i_mutex);
> + ret = inode_newsize_ok(inode, (offset + len));
> + if (ret) {
> + mutex_unlock(&inode->i_mutex);
> + return ret;
> + }
> retry:
> while (ret >= 0 && ret < max_blocks) {
> block = block + ret;
>
>
> Similarly for ocfs2, btrfs and xfs..
>
> --
> Regards,
> Amit Arora
>
On Mon, May 03, 2010 at 09:53:44AM +0530, Nikanth Karthikesan wrote:
> On Saturday 01 May 2010 12:34:26 Amit K. Arora wrote:
> > On Fri, Apr 30, 2010 at 02:33:19PM -0700, Andrew Morton wrote:
> > > Also, there doesn't seem to be much point in doing
> > >
> > > mutex_lock(i_mutex);
> > > if (some_condition)
> > > bale out
> > > mutex_unlock(i_mutex);
> > >
> > > <stuff>
> > >
> > > because `some_condition' can now become true before or during the
> > > execution of `stuff'.
> > >
> > > IOW, it's racy.
>
> oh, yes. :(
>
> > Agreed. How about doing this check in the filesystem specific fallocate
> > inode routines instead ? For example, in ext4 we could do :
>
> I guess, calling the filesystem specific fallocate with the lock held would
> create lock ordering problems? If so, this might be the only way. But it would
> be better to document at the call site, that the callee should check for
> RLIMIT_FSIZE.
Hmm.. I never said to call the filesystem specific fallocate with
i_mutex held. What I suggested was that each filesystem at some point
anyhow takes the i_mutex to preallocate. Thats where the check should
be, to avoid the race. This is what the example patch below does.
--
Regards,
Amit Arora
> Thanks
> Nikanth
>
> > diff -Nuarp linux-2.6.org/fs/ext4/extents.c linux-2.6.new/fs/ext4/extents.c
> > --- linux-2.6.org/fs/ext4/extents.c 2010-05-01 12:16:07.000000000 +0530
> > +++ linux-2.6.new/fs/ext4/extents.c 2010-05-01 12:17:37.000000000 +0530
> > @@ -3672,6 +3672,11 @@ long ext4_fallocate(struct inode *inode,
> > */
> > credits = ext4_chunk_trans_blocks(inode, max_blocks);
> > mutex_lock(&inode->i_mutex);
> > + ret = inode_newsize_ok(inode, (offset + len));
> > + if (ret) {
> > + mutex_unlock(&inode->i_mutex);
> > + return ret;
> > + }
> > retry:
> > while (ret >= 0 && ret < max_blocks) {
> > block = block + ret;
> >
> >
> > Similarly for ocfs2, btrfs and xfs..
> >
> > --
> > Regards,
> > Amit Arora
> >
On Sat, May 01, 2010 at 06:18:46AM -0400, Christoph Hellwig wrote:
> On Sat, May 01, 2010 at 12:34:26PM +0530, Amit K. Arora wrote:
> > Agreed. How about doing this check in the filesystem specific fallocate
> > inode routines instead ? For example, in ext4 we could do :
>
> That looks okay - in fact XFS should already have this check because
> it re-uses the setattr implementation to set the size.
You are right. I have written a test and it passes on XFS, but fails on
ext4.
> Can you submit an xfstests testcase to verify this behaviour on all
> filesystems?
Sure. Its ready and is under test. Will post shortly..
--
Regards,
Amit Arora
On Monday 03 May 2010 12:29:45 Amit K. Arora wrote:
> On Mon, May 03, 2010 at 09:53:44AM +0530, Nikanth Karthikesan wrote:
> > On Saturday 01 May 2010 12:34:26 Amit K. Arora wrote:
> > > On Fri, Apr 30, 2010 at 02:33:19PM -0700, Andrew Morton wrote:
> > > > Also, there doesn't seem to be much point in doing
> > > >
> > > > mutex_lock(i_mutex);
> > > > if (some_condition)
> > > > bale out
> > > > mutex_unlock(i_mutex);
> > > >
> > > > <stuff>
> > > >
> > > > because `some_condition' can now become true before or during the
> > > > execution of `stuff'.
> > > >
> > > > IOW, it's racy.
> >
> > oh, yes. :(
> >
> > > Agreed. How about doing this check in the filesystem specific fallocate
> > > inode routines instead ? For example, in ext4 we could do :
> >
> > I guess, calling the filesystem specific fallocate with the lock held
> > would create lock ordering problems? If so, this might be the only way.
> > But it would be better to document at the call site, that the callee
> > should check for RLIMIT_FSIZE.
>
> Hmm.. I never said to call the filesystem specific fallocate with
> i_mutex held. What I suggested was that each filesystem at some point
> anyhow takes the i_mutex to preallocate. Thats where the check should
> be, to avoid the race. This is what the example patch below does.
>
Yes, you never said that. But I just wondered whether that would be have
problems and doing the check in filesystem specific fallocate is the only
solution. :)
Thanks
Nikanth
On Sat, May 01, 2010 at 06:18:46AM -0400, Christoph Hellwig wrote:
> On Sat, May 01, 2010 at 12:34:26PM +0530, Amit K. Arora wrote:
> > Agreed. How about doing this check in the filesystem specific fallocate
> > inode routines instead ? For example, in ext4 we could do :
>
> That looks okay - in fact XFS should already have this check because
> it re-uses the setattr implementation to set the size.
>
> Can you submit an xfstests testcase to verify this behaviour on all
> filesystems?
Here is the new testcase.
I have run this test on a x86_64 box on XFS and ext4 on 2.6.34-rc6. It
passes on XFS, but fails on ext4. Below is the snapshot of results
followed by the testcase itself.
--
Regards,
Amit Arora
Test results:
------------
# ./check 228
FSTYP -- xfs (non-debug)
PLATFORM -- Linux/x86_64 elm9m93 2.6.34-rc6
228 0s ...
Ran: 228
Passed all 1 tests
#
# umount /mnt
# mkfs.ext4 /dev/sda4 >/dev/null
mke2fs 1.41.10 (10-Feb-2009)
# ./check 228
FSTYP -- ext4
PLATFORM -- Linux/x86_64 elm9m93 2.6.34-rc6
228 0s ... - output mismatch (see 228.out.bad)
--- 228.out 2010-05-03 02:51:24.000000000 -0400
+++ 228.out.bad 2010-05-03 04:27:33.000000000 -0400
@@ -1,2 +1 @@
QA output created by 228
-File size limit exceeded (core dumped)
Ran: 228
Failures: 228
Failed 1 of 1 tests
#
Here is the test:
----------------
Add a new testcase to the xfstests suite to check if fallocate respects
the limit imposed by RLIMIT_FSIZE (can be set by "ulimit -f XXX") or
not, on a particular filesystem.
Signed-off-by: Amit Arora <[email protected]>
diff -Nuarp xfstests-dev.org/228 xfstests-dev/228
--- xfstests-dev.org/228 1969-12-31 19:00:00.000000000 -0500
+++ xfstests-dev/228 2010-05-03 02:45:10.000000000 -0400
@@ -0,0 +1,79 @@
+#! /bin/bash
+# FS QA Test No. 228
+#
+# Check if fallocate respects RLIMIT_FSIZE
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2010 IBM Corporation. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+#-----------------------------------------------------------------------
+#
+# creator
[email protected]
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+}
+
+here=`pwd`
+tmp=$TEST_DIR/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15 25
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# real QA test starts here
+# generic, but xfs_io's fallocate must work
+_supported_fs generic
+# only Linux supports fallocate
+_supported_os Linux
+
+[ -n "$XFS_IO_PROG" ] || _notrun "xfs_io executable not found"
+
+rm -f $seq.full
+
+# Sanity check to see if fallocate works
+_require_xfs_io_falloc
+
+# Check if we have good enough space available
+avail=`df -P $TEST_DIR | awk 'END {print $4}'`
+[ "$avail" -ge 104000 ] || _notrun "Test device is too small ($avail KiB)"
+
+# Set the FSIZE ulimit to 100MB and check
+ulimit -f 102400
+flim=`ulimit -f`
+[ "$flim" != "unlimited" ] || _notrun "Unable to set FSIZE ulimit"
+[ "$flim" -eq 102400 ] || _notrun "FSIZE ulimit is not correct (100 MB)"
+
+# FSIZE limit is now set to 100 MB.
+# Lets try to preallocate 101 MB. This should fail.
+$XFS_IO_PROG -F -f -c 'falloc 0 101m' $TEST_DIR/ouch
+rm -f $TEST_DIR/ouch
+
+# Lets now try to preallocate 50 MB. This should succeed.
+$XFS_IO_PROG -F -f -c 'falloc 0 50m' $TEST_DIR/ouch
+rm -f $TEST_DIR/ouch
+
+# success, all done
+status=0
+exit
diff -Nuarp xfstests-dev.org/group xfstests-dev/group
--- xfstests-dev.org/group 2010-05-03 02:35:09.000000000 -0400
+++ xfstests-dev/group 2010-05-03 02:45:21.000000000 -0400
@@ -341,3 +341,4 @@ deprecated
225 auto quick
226 auto enospc
227 auto fsr
+228 rw auto prealloc quick
Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
and create a file larger than the limit. Add a check for that.
Signed-off-by: Nikanth Karthikesan <[email protected]>
---
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2bfdc64..a1d8fbc 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5829,6 +5829,11 @@ static long btrfs_fallocate(struct inode *inode, int mode,
btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start);
mutex_lock(&inode->i_mutex);
+
+ ret = inode_newsize_ok(inode, (offset + len));
+ if (ret)
+ goto out;
+
if (alloc_start > inode->i_size) {
ret = btrfs_cont_expand(inode, alloc_start);
if (ret)
I assumed that Amit would send a patch with s-o-b, if not, please take
this patch.
Thanks
Nikanth
Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
and create a file larger than the limit. Add a check for that.
Signed-off-by: Nikanth Karthikesan <[email protected]>
---
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 236b834..39b8123 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3672,6 +3672,11 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
*/
credits = ext4_chunk_trans_blocks(inode, max_blocks);
mutex_lock(&inode->i_mutex);
+ ret = inode_newsize_ok(inode, (len + offset));
+ if (ret) {
+ mutex_unlock(&inode->i_mutex);
+ return ret;
+ }
retry:
while (ret >= 0 && ret < max_blocks) {
block = block + ret;
On Tue, May 04, 2010 at 11:15:04AM +0530, Nikanth Karthikesan wrote:
> I assumed that Amit would send a patch with s-o-b, if not, please take
> this patch.
Have added my sign-off below. Thanks!
--
Regards,
Amit Arora
> Thanks
> Nikanth
>
> Prevent creation of files larger than RLIMIT_FSIZE using fallocate.
>
> Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
> and create a file larger than the limit. Add a check for that.
>
>
> Signed-off-by: Nikanth Karthikesan <[email protected]>
Signed-off-by: Amit Arora <[email protected]>
> ---
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 236b834..39b8123 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3672,6 +3672,11 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
> */
> credits = ext4_chunk_trans_blocks(inode, max_blocks);
> mutex_lock(&inode->i_mutex);
> + ret = inode_newsize_ok(inode, (len + offset));
> + if (ret) {
> + mutex_unlock(&inode->i_mutex);
> + return ret;
> + }
> retry:
> while (ret >= 0 && ret < max_blocks) {
> block = block + ret;
Amit K. Arora wrote:
> On Sat, May 01, 2010 at 06:18:46AM -0400, Christoph Hellwig wrote:
>> On Sat, May 01, 2010 at 12:34:26PM +0530, Amit K. Arora wrote:
>>> Agreed. How about doing this check in the filesystem specific fallocate
>>> inode routines instead ? For example, in ext4 we could do :
>> That looks okay - in fact XFS should already have this check because
>> it re-uses the setattr implementation to set the size.
>>
>> Can you submit an xfstests testcase to verify this behaviour on all
>> filesystems?
>
> Here is the new testcase.
Thanks! A few comments...
> I have run this test on a x86_64 box on XFS and ext4 on 2.6.34-rc6. It
> passes on XFS, but fails on ext4. Below is the snapshot of results
> followed by the testcase itself.
>
> --
> Regards,
> Amit Arora
>
> Test results:
> ------------
> # ./check 228
> FSTYP -- xfs (non-debug)
> PLATFORM -- Linux/x86_64 elm9m93 2.6.34-rc6
>
> 228 0s ...
> Ran: 228
> Passed all 1 tests
> #
> # umount /mnt
> # mkfs.ext4 /dev/sda4 >/dev/null
> mke2fs 1.41.10 (10-Feb-2009)
> # ./check 228
> FSTYP -- ext4
> PLATFORM -- Linux/x86_64 elm9m93 2.6.34-rc6
>
> 228 0s ... - output mismatch (see 228.out.bad)
> --- 228.out 2010-05-03 02:51:24.000000000 -0400
> +++ 228.out.bad 2010-05-03 04:27:33.000000000 -0400
> @@ -1,2 +1 @@
> QA output created by 228
> -File size limit exceeded (core dumped)
> Ran: 228
> Failures: 228
> Failed 1 of 1 tests
> #
228.out is missing from the patch
Also on my fedora box I don't get a coredump by default; can
you either make that explicit, or filter out the core message?
>
> Here is the test:
> ----------------
> Add a new testcase to the xfstests suite to check if fallocate respects
> the limit imposed by RLIMIT_FSIZE (can be set by "ulimit -f XXX") or
> not, on a particular filesystem.
...
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
Nitpick, I don't think you need common.filter, doesn't look like you are
using it.
> +# FSIZE limit is now set to 100 MB.
> +# Lets try to preallocate 101 MB. This should fail.
> +$XFS_IO_PROG -F -f -c 'falloc 0 101m' $TEST_DIR/ouch
> +rm -f $TEST_DIR/ouch
> +
> +# Lets now try to preallocate 50 MB. This should succeed.
> +$XFS_IO_PROG -F -f -c 'falloc 0 50m' $TEST_DIR/ouch
> +rm -f $TEST_DIR/ouch
Even more nitpicky, but sometimes I think it's nice to have the .out
file be a bit more descriptive in and of itself so when you see a
failing diff you have a better idea what's gone wrong.
Changing the comments to echos, like:
+# FSIZE limit is now set to 100 MB.
+# echo "Lets try to preallocate 101 MB. This should fail."
+$XFS_IO_PROG -F -f -c 'falloc 0 101m' $TEST_DIR/ouch
+rm -f $TEST_DIR/ouch
etc ... would make a failure look like:
--- 228.out 2010-05-04 15:42:31.924278768 -0500
+++ 228.out.bad 2010-05-04 15:42:36.961278392 -0500
@@ -1,3 +1,2 @@
QA output created by 228
Lets try to preallocate 101 MB. This should fail.
-File size limit exceeded
Lets now try to preallocate 50 MB. This should succeed.
... just a thought.
Thanks,
-Eric
On Tue, May 04, 2010 at 03:44:08PM -0500, Eric Sandeen wrote:
> Amit K. Arora wrote:
> > Here is the new testcase.
> Thanks! A few comments...
Thanks for the review!
> 228.out is missing from the patch
Ok, added it in the new patch.
> Also on my fedora box I don't get a coredump by default; can
> you either make that explicit, or filter out the core message?
Hmm.. for some strange reason I am no longer seeing this message. Tried
on the same system as last time and couple of others also.
> >
> > Here is the test:
> > ----------------
> > Add a new testcase to the xfstests suite to check if fallocate respects
> > the limit imposed by RLIMIT_FSIZE (can be set by "ulimit -f XXX") or
> > not, on a particular filesystem.
>
> ...
>
> > +# get standard environment, filters and checks
> > +. ./common.rc
> > +. ./common.filter
>
> Nitpick, I don't think you need common.filter, doesn't look like you are
> using it.
Right. Removed it..
> > +# FSIZE limit is now set to 100 MB.
> > +# Lets try to preallocate 101 MB. This should fail.
> > +$XFS_IO_PROG -F -f -c 'falloc 0 101m' $TEST_DIR/ouch
> > +rm -f $TEST_DIR/ouch
> > +
> > +# Lets now try to preallocate 50 MB. This should succeed.
> > +$XFS_IO_PROG -F -f -c 'falloc 0 50m' $TEST_DIR/ouch
> > +rm -f $TEST_DIR/ouch
>
> Even more nitpicky, but sometimes I think it's nice to have the .out
> file be a bit more descriptive in and of itself so when you see a
> failing diff you have a better idea what's gone wrong.
Agreed. Done.
Here is the new patch with the changes:
Add a new testcase to the xfstests suite to check if fallocate respects
the limit imposed by RLIMIT_FSIZE (can be set by "ulimit -f XXX") or
not, on a particular filesystem.
Signed-off-by: Amit Arora <[email protected]>
diff -Nuarp xfstests-dev.org/228 xfstests-dev/228
--- xfstests-dev.org/228 1969-12-31 19:00:00.000000000 -0500
+++ xfstests-dev/228 2010-05-05 02:37:48.000000000 -0400
@@ -0,0 +1,79 @@
+#! /bin/bash
+# FS QA Test No. 228
+#
+# Check if fallocate respects RLIMIT_FSIZE
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2010 IBM Corporation. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+#-----------------------------------------------------------------------
+#
+# creator
[email protected]
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+_cleanup()
+{
+ cd /
+ rm -f $tmp.*
+}
+
+here=`pwd`
+tmp=$TEST_DIR/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15 25
+
+# get standard environment, filters and checks
+. ./common.rc
+
+# real QA test starts here
+# generic, but xfs_io's fallocate must work
+_supported_fs generic
+# only Linux supports fallocate
+_supported_os Linux
+
+[ -n "$XFS_IO_PROG" ] || _notrun "xfs_io executable not found"
+
+rm -f $seq.full
+
+# Sanity check to see if fallocate works
+_require_xfs_io_falloc
+
+# Check if we have good enough space available
+avail=`df -P $TEST_DIR | awk 'END {print $4}'`
+[ "$avail" -ge 104000 ] || _notrun "Test device is too small ($avail KiB)"
+
+# Set the FSIZE ulimit to 100MB and check
+ulimit -f 102400
+flim=`ulimit -f`
+[ "$flim" != "unlimited" ] || _notrun "Unable to set FSIZE ulimit"
+[ "$flim" -eq 102400 ] || _notrun "FSIZE ulimit is not correct (100 MB)"
+
+echo "File size limit is now set to 100 MB."
+echo "Let us try to preallocate 101 MB. This should fail."
+$XFS_IO_PROG -F -f -c 'falloc 0 101m' $TEST_DIR/ouch
+rm -f $TEST_DIR/ouch
+
+echo "Let us now try to preallocate 50 MB. This should succeed."
+$XFS_IO_PROG -F -f -c 'falloc 0 50m' $TEST_DIR/ouch
+rm -f $TEST_DIR/ouch
+
+echo "Test over."
+# success, all done
+status=0
+exit
diff -Nuarp xfstests-dev.org/228.out xfstests-dev/228.out
--- xfstests-dev.org/228.out 1969-12-31 19:00:00.000000000 -0500
+++ xfstests-dev/228.out 2010-05-05 02:38:30.000000000 -0400
@@ -0,0 +1,6 @@
+QA output created by 228
+File size limit is now set to 100 MB.
+Let us try to preallocate 101 MB. This should fail.
+File size limit exceeded
+Let us now try to preallocate 50 MB. This should succeed.
+Test over.
diff -Nuarp xfstests-dev.org/group xfstests-dev/group
--- xfstests-dev.org/group 2010-05-03 02:35:09.000000000 -0400
+++ xfstests-dev/group 2010-05-05 02:38:00.000000000 -0400
@@ -341,3 +341,4 @@ deprecated
225 auto quick
226 auto enospc
227 auto fsr
+228 rw auto prealloc quick
On 05/05/2010 02:55 AM, Amit K. Arora wrote:
> On Tue, May 04, 2010 at 03:44:08PM -0500, Eric Sandeen wrote:
>> Amit K. Arora wrote:
>>> Here is the new testcase.
>> Thanks! A few comments...
> Thanks for the review!
Sure thing - looks good, I'll merge it after a retest if it all goes
well. :)
-Eric
On Tue, May 04, 2010 at 11:58:16AM +0530, Amit K. Arora wrote:
> On Tue, May 04, 2010 at 11:15:04AM +0530, Nikanth Karthikesan wrote:
> > I assumed that Amit would send a patch with s-o-b, if not, please take
> > this patch.
> Have added my sign-off below. Thanks!
Added to the ext4 patch queue.
- Ted