From: Marco Stornelli <[email protected]>
All filesystems must check for the immutable flag in their fallocate
callback. A race is possible in this scenario: an application opens a
file read/write and does some work; meanwhile root sets the immutable
flag on the file; at that point the application can still call
fallocate successfully. Only Ocfs2 checks for the immutable flag at the
moment.
Signed-off-by: Marco Stornelli <[email protected]>
---
Patch is against 2.6.38-rc5
--- linux-2.6.38-rc5-orig/fs/ext4/extents.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/ext4/extents.c 2011-02-21 08:43:37.000000000 +0100
@@ -3670,6 +3670,12 @@ long ext4_fallocate(struct file *file, i
 	 */
 	credits = ext4_chunk_trans_blocks(inode, max_blocks);
 	mutex_lock(&inode->i_mutex);
+
+	if (IS_IMMUTABLE(inode)) {
+		mutex_unlock(&inode->i_mutex);
+		return -EPERM;
+	}
+
 	ret = inode_newsize_ok(inode, (len + offset));
 	if (ret) {
 		mutex_unlock(&inode->i_mutex);
--- linux-2.6.38-rc5-orig/fs/btrfs/file.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/btrfs/file.c 2011-02-21 08:55:58.000000000 +0100
@@ -1289,6 +1289,12 @@ static long btrfs_fallocate(struct file
 	btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start);
 	mutex_lock(&inode->i_mutex);
+
+	if (IS_IMMUTABLE(inode)) {
+		ret = -EPERM;
+		goto out;
+	}
+
 	ret = inode_newsize_ok(inode, alloc_end);
 	if (ret)
 		goto out;
--- linux-2.6.38-rc5-orig/fs/xfs/linux-2.6/xfs_file.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/xfs/linux-2.6/xfs_file.c 2011-02-21 09:07:46.000000000 +0100
@@ -909,6 +909,11 @@ xfs_file_fallocate(
 	if (mode & FALLOC_FL_PUNCH_HOLE)
 		cmd = XFS_IOC_UNRESVSP;
+	if (IS_IMMUTABLE(inode)) {
+		error = -EPERM;
+		goto out_unlock;
+	}
+
 	/* check the new inode size is valid before allocating */
 	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
 	    offset + len > i_size_read(inode)) {
--- linux-2.6.38-rc5-orig/fs/gfs2/file.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/gfs2/file.c 2011-02-21 09:09:17.000000000 +0100
@@ -797,6 +797,11 @@ static long gfs2_fallocate(struct file *
 	if (unlikely(error))
 		goto out_uninit;
+	if (IS_IMMUTABLE(inode)) {
+		error = -EPERM;
+		goto out_unlock;
+	}
+
 	if (!gfs2_write_alloc_required(ip, offset, len))
 		goto out_unlock;
On Mon, Feb 21, 2011 at 09:26:32AM +0100, Marco Stornelli wrote:
> From: Marco Stornelli <[email protected]>
>
> All fs must check for the immutable flag in their fallocate callback.
> It's possible to have a race condition in this scenario: an application
> open a file in read/write and it does something, meanwhile root set the
> immutable flag on the file, the application at that point can call
> fallocate with success. Only Ocfs2 check for the immutable flag at the
> moment.
Please add the check in fs/open.c:do_fallocate() so that it covers all
filesystems.
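A minimal userspace sketch of what a single check in the generic path could look like, so that no per-filesystem callback needs its own test. This is only a model: `model_inode`, `MODEL_S_IMMUTABLE`, `model_fs_fallocate()` and `model_do_fallocate()` are hypothetical names invented for illustration, not the kernel's own definitions, and the real do_fallocate() performs other validation that is not shown here.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's immutable inode flag. */
#define MODEL_S_IMMUTABLE 0x2u

/* Hypothetical, heavily reduced model of an inode. */
struct model_inode {
	unsigned int i_flags;
};

/* Counts how often the per-filesystem callback is actually reached. */
static int model_fs_calls;

/* Stand-in for a filesystem's fallocate callback (ext4, btrfs, ...). */
static long model_fs_fallocate(struct model_inode *inode,
			       long long offset, long long len)
{
	(void)inode;
	(void)offset;
	(void)len;
	model_fs_calls++;
	return 0;
}

/*
 * Generic entry point: reject immutable inodes once, here, before
 * calling into any filesystem-specific code.
 */
static long model_do_fallocate(struct model_inode *inode,
			       long long offset, long long len)
{
	if (offset < 0 || len <= 0)
		return -EINVAL;

	if (inode->i_flags & MODEL_S_IMMUTABLE)
		return -EPERM;

	return model_fs_fallocate(inode, offset, len);
}
```

In this model a call on an inode with `MODEL_S_IMMUTABLE` set returns -EPERM and never reaches `model_fs_fallocate()`, which is the property the generic placement is meant to give every filesystem.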
2011/2/21 Christoph Hellwig <[email protected]>:
> On Mon, Feb 21, 2011 at 09:26:32AM +0100, Marco Stornelli wrote:
>> From: Marco Stornelli <[email protected]>
>>
>> All fs must check for the immutable flag in their fallocate callback.
>> It's possible to have a race condition in this scenario: an application
>> open a file in read/write and it does something, meanwhile root set the
>> immutable flag on the file, the application at that point can call
>> fallocate with success. Only Ocfs2 check for the immutable flag at the
>> moment.
>
> Please add the check in fs/open.c:do_fallocate() so that it covers all
> filesystems.
>
>
The check should be done after the fs has taken the inode mutex.
Marco
Il 21/02/2011 09:26, Marco Stornelli ha scritto:
> From: Marco Stornelli <[email protected]>
>
> All fs must check for the immutable flag in their fallocate callback.
> It's possible to have a race condition in this scenario: an application
> open a file in read/write and it does something, meanwhile root set the
> immutable flag on the file, the application at that point can call
> fallocate with success. Only Ocfs2 check for the immutable flag at the
> moment.
>
> Signed-off-by: Marco Stornelli <[email protected]>
no comments?
On Mon, Feb 21, 2011 at 05:50:21PM +0100, Marco Stornelli wrote:
> 2011/2/21 Christoph Hellwig <[email protected]>:
> > On Mon, Feb 21, 2011 at 09:26:32AM +0100, Marco Stornelli wrote:
> >> From: Marco Stornelli <[email protected]>
> >>
> >> All fs must check for the immutable flag in their fallocate callback.
> >> It's possible to have a race condition in this scenario: an application
> >> open a file in read/write and it does something, meanwhile root set the
> >> immutable flag on the file, the application at that point can call
> >> fallocate with success. Only Ocfs2 check for the immutable flag at the
> >> moment.
> >
> > Please add the check in fs/open.c:do_fallocate() so that it covers all
> > filesystems.
> >
> >
>
> The check should be done after the fs got the inode mutex lock.
Why? None of the other places which check the IMMUTABLE flag do so
under the inode mutex lock. Yes, it's true that we're not
doing proper locking when updating i_flags from the ioctl (this is
true for all file systems), but this has been true for quite some
time, and using a mutex to protect bit set/clear/test operations would
be like using a sledgehammer to kill a fly.
A proper fix if we want to be completely correct about updates to
i_flags would involve using test_bit, set_bit, and clear_bit, which are
guaranteed to be atomic. This is how we update the
ext4_inode_info->i_flags (which is different from inode->i_flags) (see
the definition and use of EXT4_INODE_BIT_FNS in fs/ext4/ext4.h).
At some point, it would be good to fix how we set/get i_flags values,
but that's independent of the change that's being discussed here.
- Ted
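As a rough userspace illustration of the atomic bit-operation approach described above, the following sketch models set/clear/test on a flags word with C11 atomics instead of a mutex. The names `model_flags`, `model_set_bit()`, `model_clear_bit()` and `model_test_bit()` are hypothetical; they only mimic the idea behind the kernel's set_bit, clear_bit and test_bit and are not those functions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flags word, updated without taking any lock. */
struct model_flags {
	atomic_ulong bits;
};

/* Atomically set bit nr; concurrent updates to other bits are not lost. */
static void model_set_bit(unsigned int nr, struct model_flags *f)
{
	atomic_fetch_or(&f->bits, 1UL << nr);
}

/* Atomically clear bit nr. */
static void model_clear_bit(unsigned int nr, struct model_flags *f)
{
	atomic_fetch_and(&f->bits, ~(1UL << nr));
}

/* Read the current value of bit nr. */
static bool model_test_bit(unsigned int nr, struct model_flags *f)
{
	return (atomic_load(&f->bits) >> nr) & 1UL;
}
```

Each update is a single read-modify-write on the word, so two writers changing different bits cannot overwrite each other's change, which is the problem a plain `flags |= x` has without a lock.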
2011/2/27 Ted Ts'o <[email protected]>:
> On Mon, Feb 21, 2011 at 05:50:21PM +0100, Marco Stornelli wrote:
>> 2011/2/21 Christoph Hellwig <[email protected]>:
>> > On Mon, Feb 21, 2011 at 09:26:32AM +0100, Marco Stornelli wrote:
>> >> From: Marco Stornelli <[email protected]>
>> >>
>> >> All fs must check for the immutable flag in their fallocate callback.
>> >> It's possible to have a race condition in this scenario: an application
>> >> open a file in read/write and it does something, meanwhile root set the
>> >> immutable flag on the file, the application at that point can call
>> >> fallocate with success. Only Ocfs2 check for the immutable flag at the
>> >> moment.
>> >
>> > Please add the check in fs/open.c:do_fallocate() so that it covers all
>> > filesystems.
>> >
>> >
>>
>> The check should be done after the fs got the inode mutex lock.
>
> Why? None of the other places which check the IMMUTABLE flag do so
> under the inode mutex lock. Yes, it's true that we're not properly
> doing proper locking when updating i_flags from the ioctl (this is
> true for all file systems), but this has been true for quite some
> time, and using a mutex to protect bit set/clear/test operations would
> be like using a sledgehammer to kill a fly.
>
> A proper fix if we want to be completely correct about updates to
> i_flags would involve using test_bit, set_bit, and clear_bit, which is
> guaranteed to be atomic. This is how we update the
> ext4_inode_info->i_flags (which is different from inode->i_flags) (see
> the definition and use of EXT4_INODE_BIT_FNS in fs/ext4/ext4.h).
>
> At some point, it would be good to fix how we set/get i_flags values,
> but that's independent of the change that's being discussed here.
>
> - Ted
>
I was thinking of the possible race with the setattr callback.
Marco
Il 27/02/2011 23:49, Ted Ts'o ha scritto:
> On Mon, Feb 21, 2011 at 05:50:21PM +0100, Marco Stornelli wrote:
>> 2011/2/21 Christoph Hellwig <[email protected]>:
>>> On Mon, Feb 21, 2011 at 09:26:32AM +0100, Marco Stornelli wrote:
>>>> From: Marco Stornelli <[email protected]>
>>>>
>>>> All fs must check for the immutable flag in their fallocate callback.
>>>> It's possible to have a race condition in this scenario: an application
>>>> open a file in read/write and it does something, meanwhile root set the
>>>> immutable flag on the file, the application at that point can call
>>>> fallocate with success. Only Ocfs2 check for the immutable flag at the
>>>> moment.
>>>
>>> Please add the check in fs/open.c:do_fallocate() so that it covers all
>>> filesystems.
>>>
>>>
>>
>> The check should be done after the fs got the inode mutex lock.
>
> Why? None of the other places which check the IMMUTABLE flag do so
Let me add another point to my previous response: IMHO each fs should do
the check itself because, since the punch hole patch went in, a fs may
or may not need to check for the append-only flag. So XFS (which
supports "unreserve") should also check for append-only. I don't think
we want to allow this operation on an append-only file, do we? I'll
update my patch for this point and resend it.
Marco
From: Marco Stornelli <[email protected]>
All filesystems must check for the immutable flag in their fallocate
callback. A race is possible in this scenario: an application opens a
file read/write and does some work; meanwhile root sets the immutable
flag on the file; at that point the application can still call
fallocate successfully. Only Ocfs2 checks for the immutable flag at the
moment.
Signed-off-by: Marco Stornelli <[email protected]>
---
Patch is against 2.6.38-rc5
ChangeLog
v2: Added the check for append-only file for XFS
v1: First draft
--- linux-2.6.38-rc5-orig/fs/ext4/extents.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/ext4/extents.c 2011-02-21 08:43:37.000000000 +0100
@@ -3670,6 +3670,12 @@ long ext4_fallocate(struct file *file, i
 	 */
 	credits = ext4_chunk_trans_blocks(inode, max_blocks);
 	mutex_lock(&inode->i_mutex);
+
+	if (IS_IMMUTABLE(inode)) {
+		mutex_unlock(&inode->i_mutex);
+		return -EPERM;
+	}
+
 	ret = inode_newsize_ok(inode, (len + offset));
 	if (ret) {
 		mutex_unlock(&inode->i_mutex);
--- linux-2.6.38-rc5-orig/fs/btrfs/file.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/btrfs/file.c 2011-02-21 08:55:58.000000000 +0100
@@ -1289,6 +1289,12 @@ static long btrfs_fallocate(struct file
 	btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start);
 	mutex_lock(&inode->i_mutex);
+
+	if (IS_IMMUTABLE(inode)) {
+		ret = -EPERM;
+		goto out;
+	}
+
 	ret = inode_newsize_ok(inode, alloc_end);
 	if (ret)
 		goto out;
--- linux-2.6.38-rc5-orig/fs/gfs2/file.c 2011-02-16 04:23:45.000000000 +0100
+++ linux-2.6.38-rc5/fs/gfs2/file.c 2011-02-21 09:09:17.000000000 +0100
@@ -797,6 +797,11 @@ static long gfs2_fallocate(struct file *
 	if (unlikely(error))
 		goto out_uninit;
+	if (IS_IMMUTABLE(inode)) {
+		error = -EPERM;
+		goto out_unlock;
+	}
+
 	if (!gfs2_write_alloc_required(ip, offset, len))
 		goto out_unlock;
--- ./linux-2.6.38-rc5/fs/xfs/linux-2.6/xfs_file.c 2011-02-16 04:23:45.000000000 +0100
+++ ./linux-2.6.38-rc5/fs/xfs/linux-2.6/xfs_file.c 2011-03-03 09:25:32.000000000 +0100
@@ -906,8 +906,18 @@ xfs_file_fallocate(
 	xfs_ilock(ip, XFS_IOLOCK_EXCL);
-	if (mode & FALLOC_FL_PUNCH_HOLE)
+	if (mode & FALLOC_FL_PUNCH_HOLE) {
 		cmd = XFS_IOC_UNRESVSP;
+		if (IS_APPEND(inode)) {
+			error = -EPERM;
+			goto out_unlock;
+		}
+	}
+
+	if (IS_IMMUTABLE(inode)) {
+		error = -EPERM;
+		goto out_unlock;
+	}
 	/* check the new inode size is valid before allocating */
 	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
On Thu, Mar 03, 2011 at 09:42:27AM +0100, Marco Stornelli wrote:
> From: Marco Stornelli <[email protected]>
>
> All fs must check for the immutable flag in their fallocate callback.
> It's possible to have a race condition in this scenario: an application
> open a file in read/write and it does something, meanwhile root set the
> immutable flag on the file, the application at that point can call
> fallocate with success. Only Ocfs2 check for the immutable flag at the
> moment.
>
> Signed-off-by: Marco Stornelli <[email protected]>
> ---
> Patch is against 2.6.38-rc5
>
> ChangeLog
> v2: Added the check for append-only file for XFS
> v1: First draft
>
> --- linux-2.6.38-rc5-orig/fs/ext4/extents.c 2011-02-16 04:23:45.000000000 +0100
> +++ linux-2.6.38-rc5/fs/ext4/extents.c 2011-02-21 08:43:37.000000000 +0100
> @@ -3670,6 +3670,12 @@ long ext4_fallocate(struct file *file, i
> */
> credits = ext4_chunk_trans_blocks(inode, max_blocks);
> mutex_lock(&inode->i_mutex);
> +
> + if (IS_IMMUTABLE(inode)) {
> + mutex_unlock(&inode->i_mutex);
> + return -EPERM;
> + }
> +
> ret = inode_newsize_ok(inode, (len + offset));
> if (ret) {
> mutex_unlock(&inode->i_mutex);
> --- linux-2.6.38-rc5-orig/fs/btrfs/file.c 2011-02-16 04:23:45.000000000 +0100
> +++ linux-2.6.38-rc5/fs/btrfs/file.c 2011-02-21 08:55:58.000000000 +0100
> @@ -1289,6 +1289,12 @@ static long btrfs_fallocate(struct file
> btrfs_wait_ordered_range(inode, alloc_start, alloc_end - alloc_start);
>
> mutex_lock(&inode->i_mutex);
> +
> + if (IS_IMMUTABLE(inode)) {
> + ret = -EPERM;
> + goto out;
> + }
> +
> ret = inode_newsize_ok(inode, alloc_end);
> if (ret)
> goto out;
> --- linux-2.6.38-rc5-orig/fs/gfs2/file.c 2011-02-16 04:23:45.000000000 +0100
> +++ linux-2.6.38-rc5/fs/gfs2/file.c 2011-02-21 09:09:17.000000000 +0100
> @@ -797,6 +797,11 @@ static long gfs2_fallocate(struct file *
> if (unlikely(error))
> goto out_uninit;
>
> + if (IS_IMMUTABLE(inode)) {
> + error = -EPERM;
> + goto out_unlock;
> + }
> +
> if (!gfs2_write_alloc_required(ip, offset, len))
> goto out_unlock;
>
> --- ./linux-2.6.38-rc5/fs/xfs/linux-2.6/xfs_file.c 2011-02-16 04:23:45.000000000 +0100
> +++ ./linux-2.6.38-rc5/fs/xfs/linux-2.6/xfs_file.c 2011-03-03 09:25:32.000000000 +0100
> @@ -906,8 +906,18 @@ xfs_file_fallocate(
>
> xfs_ilock(ip, XFS_IOLOCK_EXCL);
>
> - if (mode & FALLOC_FL_PUNCH_HOLE)
> + if (mode & FALLOC_FL_PUNCH_HOLE) {
> cmd = XFS_IOC_UNRESVSP;
> + if (IS_APPEND(inode)) {
> + error = -EPERM;
> + goto out_unlock;
> + }
> + }
WTF? Why does append mode have any effect on whether we can punch
holes in a file or not? There's no justification for adding this in
the commit message. Why is it even in a patch that is for checking
immutable inodes? What is the point of adding it, when all that will
happen is people will switch to XFS_IOC_UNRESVSP which has never had
this limitation?
And this asks bigger questions - why would you allow preallocate
anywhere but at or beyond EOF on an append mode inode? You can only
append to the file, so if you're going to add limitations based on
the append flag, you need to think this through a bit more....
> +
> + if (IS_IMMUTABLE(inode)) {
> + error = -EPERM;
> + goto out_unlock;
> + }
Also, like Christoph said, these checks belong in the generic code,
not in every filesystem. The same checks have to be made for every
filesystem, so they should be done before calling out the
filesystems regardless of what functionality the filesystem actually
supports.
Cheers,
Dave.
--
Dave Chinner
[email protected]
Hi Dave,
Il 03/03/2011 22:39, Dave Chinner ha scritto:
> WTF? Why does append mode have any effect on whether we can punch
> holes in a file or not? There's no justification for adding this in
> the commit message. Why is it even in a patch that is for checking
> immutable inodes? What is the point of adding it, when all that will
> happen is people will switch to XFS_IOC_UNRESVSP which has never had
> this limitation?
So according to you, it's legal to do an "unreserve" operation on an
append-only file. I don't see it that way, but if the community says
that this is the right behavior then ok.
>
> And this asks bigger questions - why would you allow preallocate
> anywhere but at or beyond EOF on an append mode inode? You can only
> append to the file, so if you're going to add limitations based on
> the append flag, you need to think this through a bit more....
>
I don't understand this point. The theory of operation was:
1) we don't allow any operation (reserve/unreserve) on an immutable file;
2) we don't allow the *unreserve* operation on an append-only file (this
check makes sense only for filesystems that support the unreserve operation).
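The two rules above can be sketched as a small userspace predicate. This is only an illustration of the intended policy, not code from the patch: `POLICY_S_APPEND`, `POLICY_S_IMMUTABLE`, `POLICY_FL_PUNCH_HOLE` and `policy_fallocate_allowed()` are hypothetical names with arbitrary values chosen for the example.

```c
#include <errno.h>

/* Hypothetical stand-ins for the inode flags and the fallocate mode bit. */
#define POLICY_S_APPEND      0x1u
#define POLICY_S_IMMUTABLE   0x2u
#define POLICY_FL_PUNCH_HOLE 0x2

/*
 * Returns 0 if the operation may proceed, -EPERM otherwise:
 *   1) nothing (reserve or unreserve) is allowed on an immutable file;
 *   2) unreserve (hole punching) is not allowed on an append-only file.
 */
static int policy_fallocate_allowed(unsigned int i_flags, int mode)
{
	if (i_flags & POLICY_S_IMMUTABLE)
		return -EPERM;

	if ((mode & POLICY_FL_PUNCH_HOLE) && (i_flags & POLICY_S_APPEND))
		return -EPERM;

	return 0;
}
```

Under this model a plain reserve on an append-only file is still allowed; only the punch-hole mode is refused for it.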
>
> Also, like Christoph said, these checks belong in the generic code,
> not in every filesystem. The same checks have to be made for every
> filesystem, so they should be done before calling out the
> filesystems regardless of what functionality the filesystem actually
> supports.
>
This was related to the first point; if we remove it then it's ok to
do the check in common code. I still think we should do the check under
the inode lock to avoid a race between fallocate and setattr, shouldn't we?
Marco
Il 04/03/2011 09:17, Marco Stornelli ha scritto:
> Hi Dave,
>
> Il 03/03/2011 22:39, Dave Chinner ha scritto:
>> WTF? Why does append mode have any effect on whether we can punch
>> holes in a file or not? There's no justification for adding this in
>> the commit message. Why is it even in a patch that is for checking
>> immutable inodes? What is the point of adding it, when all that will
>> happen is people will switch to XFS_IOC_UNRESVSP which has never had
>> this limitation?
>
> So according to you, it's legal to do an "unreserve" operation on an
> append-only file. It's not the same for me, but if the community said
> that this is the right behavior then ok.
>
>>
>> And this asks bigger questions - why would you allow preallocate
>> anywhere but at or beyond EOF on an append mode inode? You can only
>> append to the file, so if you're going to add limitations based on
>> the append flag, you need to think this through a bit more....
>>
>
> I don't understand this point. The theory of operation was:
>
> 1) we don't allow any operation (reserve/unreserve) on a immutable file;
> 2) we don't allow *unreserve* operation on an append-only file (this
> check makes sense only for fs that support the unreserve operation).
>
>>
>> Also, like Christoph said, these checks belong in the generic code,
>> not in every filesystem. The same checks have to be made for every
>> filesystem, so they should be done before calling out the
>> filesystems regardless of what functionality the filesystem actually
>> supports.
>>
>
> This was related to the first point, if we remove it then it's ok to
> check in a common code. Even if I think we should do the check under the
> inode lock to avoid race between fallocate and setattr, isn't it?
>
Oops, I meant setflags in the ioctl path, sorry. At this point I'm waiting
for a response about how to handle the append flag and how to handle
locking for the flags. Ted pointed out that a proper fix would be to avoid
the lock and use bit operations, but that requires deep modifications to
several filesystems and could be a separate patch with its own code
review, so I think we can choose to use lock/unlock in do_fallocate.
I'll resend the patch.
Marco
On Fri, Mar 04, 2011 at 08:39:03AM +1100, Dave Chinner wrote:
> WTF? Why does append mode have any effect on whether we can punch
> holes in a file or not? There's no justification for adding this in
> the commit message. Why is it even in a patch that is for checking
> immutable inodes? What is the point of adding it, when all that will
> happen is people will switch to XFS_IOC_UNRESVSP which has never had
> this limitation?
xfs_ioc_space unconditionally rejects inodes with S_APPEND set for
all preallocation / hole punching ioctls. This might be overzealous for
preallocations not changing the size, or just extending i_size, but it's
IMHO entirely correct for hole punching.
2011/3/14 Christoph Hellwig <[email protected]>:
> On Fri, Mar 04, 2011 at 08:39:03AM +1100, Dave Chinner wrote:
>> WTF? Why does append mode have any effect on whether we can punch
>> holes in a file or not? There's no justification for adding this in
>> the commit message. Why is it even in a patch that is for checking
>> immutable inodes? What is the point of adding it, when all that will
>> happen is people will switch to XFS_IOC_UNRESVSP which has never had
>> this limitation?
>
> xfs_ioc_space unconditionally rejects inodes with S_APPEND set for
> all preallocation / hole punching ioctls. This might be overzealous for
> preallocations not changing the size, or just extending i_size, but it's
> IMHO entirely correct for hole punching.
>
xfs_ioc_space is in the ioctl path, but we are talking about the
fallocate path. Both of them call xfs_change_file_space, don't they?
Anyway, we agree about hole punching; the patch is already in
Linus's git tree.
Marco