2012-01-03 01:28:47

by Djalal Harouni

[permalink] [raw]
Subject: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode


The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
this can lead to a race with the other operations that update the same
inode.

Patch tested.

Signed-off-by: Djalal Harouni <[email protected]>
---
fs/ext3/ioctl.c | 6 +++++-
fs/ext4/ioctl.c | 6 +++++-
2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c
index ba1b54e..e7b2ed9 100644
--- a/fs/ext3/ioctl.c
+++ b/fs/ext3/ioctl.c
@@ -134,10 +134,11 @@ flags_out:
goto setversion_out;
}

+ mutex_lock(&inode->i_mutex);
handle = ext3_journal_start(inode, 1);
if (IS_ERR(handle)) {
err = PTR_ERR(handle);
- goto setversion_out;
+ goto unlock_out;
}
err = ext3_reserve_inode_write(handle, inode, &iloc);
if (err == 0) {
@@ -146,6 +147,9 @@ flags_out:
err = ext3_mark_iloc_dirty(handle, inode, &iloc);
}
ext3_journal_stop(handle);
+
+unlock_out:
+ mutex_unlock(&inode->i_mutex);
setversion_out:
mnt_drop_write(filp->f_path.mnt);
return err;
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index a567968..46a8de6 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -158,10 +158,11 @@ flags_out:
goto setversion_out;
}

+ mutex_lock(&inode->i_mutex);
handle = ext4_journal_start(inode, 1);
if (IS_ERR(handle)) {
err = PTR_ERR(handle);
- goto setversion_out;
+ goto unlock_out;
}
err = ext4_reserve_inode_write(handle, inode, &iloc);
if (err == 0) {
@@ -170,6 +171,9 @@ flags_out:
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
}
ext4_journal_stop(handle);
+
+unlock_out:
+ mutex_unlock(&inode->i_mutex);
setversion_out:
mnt_drop_write(filp->f_path.mnt);
return err;
--
1.7.1


2012-01-03 12:46:26

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

Hello,

On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
>
> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> this can lead to a race with the other operations that update the same
> inode.
>
> Patch tested.
Thanks for the patch but I don't quite understand the problem.
i_generation is set when:
a) inode is loaded from disk
b) inode is allocated
c) in SETVERSION ioctl

The only thing that can race here seems to be c) against c) and that is
racy with i_mutex as well. So what problems do you exactly observe without
the patch?

Honza
> Signed-off-by: Djalal Harouni <[email protected]>
> ---
> fs/ext3/ioctl.c | 6 +++++-
> fs/ext4/ioctl.c | 6 +++++-
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c
> index ba1b54e..e7b2ed9 100644
> --- a/fs/ext3/ioctl.c
> +++ b/fs/ext3/ioctl.c
> @@ -134,10 +134,11 @@ flags_out:
> goto setversion_out;
> }
>
> + mutex_lock(&inode->i_mutex);
> handle = ext3_journal_start(inode, 1);
> if (IS_ERR(handle)) {
> err = PTR_ERR(handle);
> - goto setversion_out;
> + goto unlock_out;
> }
> err = ext3_reserve_inode_write(handle, inode, &iloc);
> if (err == 0) {
> @@ -146,6 +147,9 @@ flags_out:
> err = ext3_mark_iloc_dirty(handle, inode, &iloc);
> }
> ext3_journal_stop(handle);
> +
> +unlock_out:
> + mutex_unlock(&inode->i_mutex);
> setversion_out:
> mnt_drop_write(filp->f_path.mnt);
> return err;
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index a567968..46a8de6 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -158,10 +158,11 @@ flags_out:
> goto setversion_out;
> }
>
> + mutex_lock(&inode->i_mutex);
> handle = ext4_journal_start(inode, 1);
> if (IS_ERR(handle)) {
> err = PTR_ERR(handle);
> - goto setversion_out;
> + goto unlock_out;
> }
> err = ext4_reserve_inode_write(handle, inode, &iloc);
> if (err == 0) {
> @@ -170,6 +171,9 @@ flags_out:
> err = ext4_mark_iloc_dirty(handle, inode, &iloc);
> }
> ext4_journal_stop(handle);
> +
> +unlock_out:
> + mutex_unlock(&inode->i_mutex);
> setversion_out:
> mnt_drop_write(filp->f_path.mnt);
> return err;
> --
> 1.7.1
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-01-03 23:11:28

by Djalal Harouni

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Tue, Jan 03, 2012 at 01:46:24PM +0100, Jan Kara wrote:
> Hello,
>
> On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
> >
> > The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> > this can lead to a race with the other operations that update the same
> > inode.
> >
> > Patch tested.
> Thanks for the patch but I don't quite understand the problem.
> i_generation is set when:
> a) inode is loaded from disk
> b) inode is allocated
> c) in SETVERSION ioctl
>
> The only thing that can race here seems to be c) against c) and that is
> racy with i_mutex as well. So what problems do you exactly observe without
> the patch?
Right, but what about the related i_ctime change ? (i_ctime is updated in
other places...)

The i_ctime update must reflect the _appropriate_ inode modification
operation. This is why IMHO we should protect them to avoid a lost update.

BTW the i_generation which is used by NFS and fuse filesystems is updated
even if the inode is marked immutable, is this the intended behaviour?


> Honza
Thanks for your response.

--
tixxdz
http://opendz.org

2012-01-04 17:34:50

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Wed 04-01-12 00:14:32, Djalal Harouni wrote:
> On Tue, Jan 03, 2012 at 01:46:24PM +0100, Jan Kara wrote:
> > On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
> > >
> > > The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> > > this can lead to a race with the other operations that update the same
> > > inode.
> > >
> > > Patch tested.
> > Thanks for the patch but I don't quite understand the problem.
> > i_generation is set when:
> > a) inode is loaded from disk
> > b) inode is allocated
> > c) in SETVERSION ioctl
> >
> > The only thing that can race here seems to be c) against c) and that is
> > racy with i_mutex as well. So what problems do you exactly observe without
> > the patch?
> Right, but what about the related i_ctime change ? (i_ctime is updated in
> other places...)
>
> The i_ctime update must reflect the _appropriate_ inode modification
> operation. This is why IMHO we should protect them to avoid a lost update.
Yes, but realistically even if we race with someone else updating
i_ctime, the times will be rather similar so there's hardly a real
difference.

Anyway, using i_mutex is consistent with how we handle permission changes
or timestamp changes and the ioctl isn't performance critical so I'll take
your patch. I was just wondering whether you observed a real problem or
whether it's more or less a theoretical concern.

> BTW the i_generation which is used by NFS and fuse filesystems is updated
> even if the inode is marked immutable, is this the intended behaviour?
Well, I'd say it's unexpected that generation can be changed for
immutable inode so I'd be inclined to take a patch that would make ext3
refuse to do that. But it's a change in how the ioctl behaves so I'd like
to hear opinion of Ted or Andrew on this.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-01-04 17:46:11

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
>
> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> this can lead to a race with the other operations that update the same
> inode.
>
> Patch tested.
OK, so I've taken the patch into my tree, just updated the changelog
which result of our discussion in this thread. I also took the ext4 part
since there is no risk of conflict and the patch looks obvious.

Honza

> Signed-off-by: Djalal Harouni <[email protected]>
> ---
> fs/ext3/ioctl.c | 6 +++++-
> fs/ext4/ioctl.c | 6 +++++-
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c
> index ba1b54e..e7b2ed9 100644
> --- a/fs/ext3/ioctl.c
> +++ b/fs/ext3/ioctl.c
> @@ -134,10 +134,11 @@ flags_out:
> goto setversion_out;
> }
>
> + mutex_lock(&inode->i_mutex);
> handle = ext3_journal_start(inode, 1);
> if (IS_ERR(handle)) {
> err = PTR_ERR(handle);
> - goto setversion_out;
> + goto unlock_out;
> }
> err = ext3_reserve_inode_write(handle, inode, &iloc);
> if (err == 0) {
> @@ -146,6 +147,9 @@ flags_out:
> err = ext3_mark_iloc_dirty(handle, inode, &iloc);
> }
> ext3_journal_stop(handle);
> +
> +unlock_out:
> + mutex_unlock(&inode->i_mutex);
> setversion_out:
> mnt_drop_write(filp->f_path.mnt);
> return err;
> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
> index a567968..46a8de6 100644
> --- a/fs/ext4/ioctl.c
> +++ b/fs/ext4/ioctl.c
> @@ -158,10 +158,11 @@ flags_out:
> goto setversion_out;
> }
>
> + mutex_lock(&inode->i_mutex);
> handle = ext4_journal_start(inode, 1);
> if (IS_ERR(handle)) {
> err = PTR_ERR(handle);
> - goto setversion_out;
> + goto unlock_out;
> }
> err = ext4_reserve_inode_write(handle, inode, &iloc);
> if (err == 0) {
> @@ -170,6 +171,9 @@ flags_out:
> err = ext4_mark_iloc_dirty(handle, inode, &iloc);
> }
> ext4_journal_stop(handle);
> +
> +unlock_out:
> + mutex_unlock(&inode->i_mutex);
> setversion_out:
> mnt_drop_write(filp->f_path.mnt);
> return err;
> --
> 1.7.1
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-01-04 23:15:07

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On 2012-01-04, at 10:46 AM, Jan Kara wrote:
> On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
>>
>> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
>> this can lead to a race with the other operations that update the same
>> inode.
>>
>> Patch tested.
>
> OK, so I've taken the patch into my tree, just updated the changelog
> which result of our discussion in this thread. I also took the ext4 part
> since there is no risk of conflict and the patch looks obvious.

Actually, I'd like to hear more about whether this is a real problem, or
if it is just a theoretical problem found during code inspection or from
some static code analysis tool?

With the metadata checksum feature we were discussing using the inode
generation as part of the seed for the directory leaf block checksum, so
that it wasn't possible to incorrectly access stale directory blocks from
a previous incarnation of the same inode number.

We were discussing just disabling this ioctl on filesystems with metadata
checksums, and printing a deprecation warning for filesystems without that
feature enabled. I'm not aware of any real-world use for this ioctl, since
NFS cannot use it to reconstruct handles because there's no API to allocate
an inode with a specific number, so setting the generation is pointless.

This ioctl (despite its confusing name) is completely different from the
NFSv4 inode version, which is stored in i_version (and i_version_hi).

Cheers, Andreas

>> Signed-off-by: Djalal Harouni <[email protected]>
>> ---
>> fs/ext3/ioctl.c | 6 +++++-
>> fs/ext4/ioctl.c | 6 +++++-
>> 2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c
>> index ba1b54e..e7b2ed9 100644
>> --- a/fs/ext3/ioctl.c
>> +++ b/fs/ext3/ioctl.c
>> @@ -134,10 +134,11 @@ flags_out:
>> goto setversion_out;
>> }
>>
>> + mutex_lock(&inode->i_mutex);
>> handle = ext3_journal_start(inode, 1);
>> if (IS_ERR(handle)) {
>> err = PTR_ERR(handle);
>> - goto setversion_out;
>> + goto unlock_out;
>> }
>> err = ext3_reserve_inode_write(handle, inode, &iloc);
>> if (err == 0) {
>> @@ -146,6 +147,9 @@ flags_out:
>> err = ext3_mark_iloc_dirty(handle, inode, &iloc);
>> }
>> ext3_journal_stop(handle);
>> +
>> +unlock_out:
>> + mutex_unlock(&inode->i_mutex);
>> setversion_out:
>> mnt_drop_write(filp->f_path.mnt);
>> return err;
>> diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
>> index a567968..46a8de6 100644
>> --- a/fs/ext4/ioctl.c
>> +++ b/fs/ext4/ioctl.c
>> @@ -158,10 +158,11 @@ flags_out:
>> goto setversion_out;
>> }
>>
>> + mutex_lock(&inode->i_mutex);
>> handle = ext4_journal_start(inode, 1);
>> if (IS_ERR(handle)) {
>> err = PTR_ERR(handle);
>> - goto setversion_out;
>> + goto unlock_out;
>> }
>> err = ext4_reserve_inode_write(handle, inode, &iloc);
>> if (err == 0) {
>> @@ -170,6 +171,9 @@ flags_out:
>> err = ext4_mark_iloc_dirty(handle, inode, &iloc);
>> }
>> ext4_journal_stop(handle);
>> +
>> +unlock_out:
>> + mutex_unlock(&inode->i_mutex);
>> setversion_out:
>> mnt_drop_write(filp->f_path.mnt);
>> return err;
>> --
>> 1.7.1
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR


Cheers, Andreas






2012-01-04 23:32:56

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Wed 04-01-12 16:15:04, Andreas Dilger wrote:
> On 2012-01-04, at 10:46 AM, Jan Kara wrote:
> > On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
> >>
> >> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> >> this can lead to a race with the other operations that update the same
> >> inode.
> >>
> >> Patch tested.
> >
> > OK, so I've taken the patch into my tree, just updated the changelog
> > which result of our discussion in this thread. I also took the ext4 part
> > since there is no risk of conflict and the patch looks obvious.
>
> Actually, I'd like to hear more about whether this is a real problem, or
> if it is just a theoretical problem found during code inspection or from
> some static code analysis tool?
As far as I understood that was just a theoretical issue and I applied
the patch just on the grounds that it is more consistent to use i_mutex for
i_generation changes.

> With the metadata checksum feature we were discussing using the inode
> generation as part of the seed for the directory leaf block checksum, so
> that it wasn't possible to incorrectly access stale directory blocks from
> a previous incarnation of the same inode number.
>
> We were discussing just disabling this ioctl on filesystems with metadata
> checksums, and printing a deprecation warning for filesystems without that
> feature enabled. I'm not aware of any real-world use for this ioctl, since
> NFS cannot use it to reconstruct handles because there's no API to allocate
> an inode with a specific number, so setting the generation is pointless.
OK, I didn't know this. I'm fine with deprecating the ioctl if it's
useless but since that's going to take a while I think the cleanup still
makes some sense.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-01-04 23:57:01

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On 2012-01-04, at 4:32 PM, Jan Kara wrote:
> On Wed 04-01-12 16:15:04, Andreas Dilger wrote:
>> On 2012-01-04, at 10:46 AM, Jan Kara wrote:
>>> On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
>>>>
>>>> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
>>>> this can lead to a race with the other operations that update the same
>>>> inode.
>>>>
>>>> Patch tested.
>>>
>>> OK, so I've taken the patch into my tree, just updated the changelog
>>> which result of our discussion in this thread. I also took the ext4 part
>>> since there is no risk of conflict and the patch looks obvious.
>>
>> Actually, I'd like to hear more about whether this is a real problem, or
>> if it is just a theoretical problem found during code inspection or from
>> some static code analysis tool?
>
> As far as I understood that was just a theoretical issue and I applied
> the patch just on the grounds that it is more consistent to use i_mutex for
> i_generation changes.
>
>> With the metadata checksum feature we were discussing using the inode
>> generation as part of the seed for the directory leaf block checksum, so
>> that it wasn't possible to incorrectly access stale directory blocks from
>> a previous incarnation of the same inode number.
>>
>> We were discussing just disabling this ioctl on filesystems with metadata
>> checksums, and printing a deprecation warning for filesystems without that
>> feature enabled. I'm not aware of any real-world use for this ioctl, since
>> NFS cannot use it to reconstruct handles because there's no API to allocate
>> an inode with a specific number, so setting the generation is pointless.
>
> OK, I didn't know this. I'm fine with deprecating the ioctl if it's
> useless but since that's going to take a while I think the cleanup still
> makes some sense.

I'm not against landing the patch, and I agree that there is no question
about the performance impact of making this ioctl safe. My real question
was whether there was a real-world use for this ioctl which might prevent
it from being deprecated.


Cheers, Andreas






2012-01-05 00:37:06

by Djalal Harouni

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Thu, Jan 05, 2012 at 12:32:54AM +0100, Jan Kara wrote:
> On Wed 04-01-12 16:15:04, Andreas Dilger wrote:
> > On 2012-01-04, at 10:46 AM, Jan Kara wrote:
> > > On Tue 03-01-12 02:31:52, Djalal Harouni wrote:
> > >>
> > >> The EXT{3,4}_IOC_SETVERSION ioctl() updates the inode without i_mutex,
> > >> this can lead to a race with the other operations that update the same
> > >> inode.
> > >>
> > >> Patch tested.
> > >
> > > OK, so I've taken the patch into my tree, just updated the changelog
> > > which result of our discussion in this thread. I also took the ext4 part
> > > since there is no risk of conflict and the patch looks obvious.
> >
> > Actually, I'd like to hear more about whether this is a real problem, or
> > if it is just a theoretical problem found during code inspection or from
> > some static code analysis tool?
> As far as I understood that was just a theoretical issue and I applied
> the patch just on the grounds that it is more consistent to use i_mutex for
> i_generation changes.
This was found using a static code analysis tool (currently a PoC) which
is a part of a research project at our university.

And later, code inspection confirms that i_ctime updates can be disturbed.

I should have specified this. Sorry.

> > With the metadata checksum feature we were discussing using the inode
> > generation as part of the seed for the directory leaf block checksum, so
> > that it wasn't possible to incorrectly access stale directory blocks from
> > a previous incarnation of the same inode number.
> >
> > We were discussing just disabling this ioctl on filesystems with metadata
> > checksums, and printing a deprecation warning for filesystems without that
> > feature enabled. I'm not aware of any real-world use for this ioctl, since
> > NFS cannot use it to reconstruct handles because there's no API to allocate
> > an inode with a specific number, so setting the generation is pointless.
> OK, I didn't know this. I'm fine with deprecating the ioctl if it's
> useless but since that's going to take a while I think the cleanup still
> makes some sense.
Actually I've grepped this ioctl but did not found use cases, but as
ext{3,2} also support it, I did not say anything (this is old, there is
even the EXT4_IOC_SETVERSION_OLD ioctl ?). I don't know if this ioctl is
used or not.

Only the reiserfs and ext{2,3,4} filesystems support this ioctl. The reiserfs
do not use mutexes at all, even in the REISERFS_IOC_SETFLAGS ioctl which will
test and set _all_ the possible values of the i_flags field.
Perhaps I should also send a patch for this ?

And perhaps ext2 should also be updated.

> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR
Thanks for the feedback.

--
tixxdz
http://opendz.org

2012-01-05 11:42:10

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Thu 05-01-12 01:40:09, Djalal Harouni wrote:
> On Thu, Jan 05, 2012 at 12:32:54AM +0100, Jan Kara wrote:
> > > With the metadata checksum feature we were discussing using the inode
> > > generation as part of the seed for the directory leaf block checksum, so
> > > that it wasn't possible to incorrectly access stale directory blocks from
> > > a previous incarnation of the same inode number.
> > >
> > > We were discussing just disabling this ioctl on filesystems with metadata
> > > checksums, and printing a deprecation warning for filesystems without that
> > > feature enabled. I'm not aware of any real-world use for this ioctl, since
> > > NFS cannot use it to reconstruct handles because there's no API to allocate
> > > an inode with a specific number, so setting the generation is pointless.
> > OK, I didn't know this. I'm fine with deprecating the ioctl if it's
> > useless but since that's going to take a while I think the cleanup still
> > makes some sense.
> Actually I've grepped this ioctl but did not found use cases, but as
> ext{3,2} also support it, I did not say anything (this is old, there is
> even the EXT4_IOC_SETVERSION_OLD ioctl ?). I don't know if this ioctl is
> used or not.
>
> Only the reiserfs and ext{2,3,4} filesystems support this ioctl. The reiserfs
> do not use mutexes at all, even in the REISERFS_IOC_SETFLAGS ioctl which will
> test and set _all_ the possible values of the i_flags field.
> Perhaps I should also send a patch for this ?
Yes, possibly reiserfs should use i_mutex for that ioctl.

> And perhaps ext2 should also be updated.
Sure. Send a patch my way when you have it.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-01-06 00:57:08

by Djalal Harouni

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Thu, Jan 05, 2012 at 12:42:05PM +0100, Jan Kara wrote:
> On Thu 05-01-12 01:40:09, Djalal Harouni wrote:
> > Only the reiserfs and ext{2,3,4} filesystems support this ioctl. The reiserfs
> > do not use mutexes at all, even in the REISERFS_IOC_SETFLAGS ioctl which will
> > test and set _all_ the possible values of the i_flags field.
> > Perhaps I should also send a patch for this ?
> Yes, possibly reiserfs should use i_mutex for that ioctl.
For the reiserfs I've missed some locks.

It seems that reiserfs uses its own 'reiserfs_sb_info->lock' (reiserfs
super block data) to serialize writers in all the ioctl, even the GET
(readers) ioctl operations. But I also know that the VFS layer uses i_mutex
to protect inode changes, so ? I'm not sure, If I can test it I'll send a
patch.

> > And perhaps ext2 should also be updated.
> Sure. Send a patch my way when you have it.
Here's the tested ext2 patch, thanks.

--------

From: Djalal Harouni <[email protected]>

ext2: protect inode changes in the SETVERSION and SETFLAGS ioctls

Unlock mutex after i_flags and i_ctime updates in the EXT2_IOC_SETFLAGS
ioctl.

Use i_mutex in the EXT2_IOC_SETVERSION ioctl to protect i_ctime and
i_generation updates and make the ioctl consistent since i_mutex is
also used in other places to protect timestamps and inode changes.

Cc: Andreas Dilger <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Djalal Harouni <[email protected]>
---
fs/ext2/ioctl.c | 22 ++++++++++++++++------
1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/fs/ext2/ioctl.c b/fs/ext2/ioctl.c
index f81e250..b7f931f 100644
--- a/fs/ext2/ioctl.c
+++ b/fs/ext2/ioctl.c
@@ -77,10 +77,11 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
flags = flags & EXT2_FL_USER_MODIFIABLE;
flags |= oldflags & ~EXT2_FL_USER_MODIFIABLE;
ei->i_flags = flags;
- mutex_unlock(&inode->i_mutex);

ext2_set_inode_flags(inode);
inode->i_ctime = CURRENT_TIME_SEC;
+ mutex_unlock(&inode->i_mutex);
+
mark_inode_dirty(inode);
setflags_out:
mnt_drop_write(filp->f_path.mnt);
@@ -88,20 +89,29 @@ setflags_out:
}
case EXT2_IOC_GETVERSION:
return put_user(inode->i_generation, (int __user *) arg);
- case EXT2_IOC_SETVERSION:
+ case EXT2_IOC_SETVERSION: {
+ __u32 generation;
+
if (!inode_owner_or_capable(inode))
return -EPERM;
ret = mnt_want_write(filp->f_path.mnt);
if (ret)
return ret;
- if (get_user(inode->i_generation, (int __user *) arg)) {
+ if (get_user(generation, (int __user *) arg)) {
ret = -EFAULT;
- } else {
- inode->i_ctime = CURRENT_TIME_SEC;
- mark_inode_dirty(inode);
+ goto setversion_out;
}
+
+ mutex_lock(&inode->i_mutex);
+ inode->i_ctime = CURRENT_TIME_SEC;
+ inode->i_generation = generation;
+ mutex_unlock(&inode->i_mutex);
+
+ mark_inode_dirty(inode);
+setversion_out:
mnt_drop_write(filp->f_path.mnt);
return ret;
+ }
case EXT2_IOC_GETRSVSZ:
if (test_opt(inode->i_sb, RESERVATION)
&& S_ISREG(inode->i_mode)
--
1.7.1

2012-01-09 15:03:40

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/ext{3,4}: fix potential race when setversion ioctl updates inode

On Fri 06-01-12 02:00:11, Djalal Harouni wrote:
> On Thu, Jan 05, 2012 at 12:42:05PM +0100, Jan Kara wrote:
> > On Thu 05-01-12 01:40:09, Djalal Harouni wrote:
> > > Only the reiserfs and ext{2,3,4} filesystems support this ioctl. The reiserfs
> > > do not use mutexes at all, even in the REISERFS_IOC_SETFLAGS ioctl which will
> > > test and set _all_ the possible values of the i_flags field.
> > > Perhaps I should also send a patch for this ?
> > Yes, possibly reiserfs should use i_mutex for that ioctl.
> For the reiserfs I've missed some locks.
>
> It seems that reiserfs uses its own 'reiserfs_sb_info->lock' (reiserfs
> super block data) to serialize writers in all the ioctl, even the GET
> (readers) ioctl operations. But I also know that the VFS layer uses i_mutex
> to protect inode changes, so ? I'm not sure, If I can test it I'll send a
> patch.
>
> > > And perhaps ext2 should also be updated.
> > Sure. Send a patch my way when you have it.
> Here's the tested ext2 patch, thanks.
Thanks I've taken it into my tree.

Honza
> --------
>
> From: Djalal Harouni <[email protected]>
>
> ext2: protect inode changes in the SETVERSION and SETFLAGS ioctls
>
> Unlock mutex after i_flags and i_ctime updates in the EXT2_IOC_SETFLAGS
> ioctl.
>
> Use i_mutex in the EXT2_IOC_SETVERSION ioctl to protect i_ctime and
> i_generation updates and make the ioctl consistent since i_mutex is
> also used in other places to protect timestamps and inode changes.
>
> Cc: Andreas Dilger <[email protected]>
> Cc: Jan Kara <[email protected]>
> Signed-off-by: Djalal Harouni <[email protected]>
> ---
> fs/ext2/ioctl.c | 22 ++++++++++++++++------
> 1 files changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext2/ioctl.c b/fs/ext2/ioctl.c
> index f81e250..b7f931f 100644
> --- a/fs/ext2/ioctl.c
> +++ b/fs/ext2/ioctl.c
> @@ -77,10 +77,11 @@ long ext2_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> flags = flags & EXT2_FL_USER_MODIFIABLE;
> flags |= oldflags & ~EXT2_FL_USER_MODIFIABLE;
> ei->i_flags = flags;
> - mutex_unlock(&inode->i_mutex);
>
> ext2_set_inode_flags(inode);
> inode->i_ctime = CURRENT_TIME_SEC;
> + mutex_unlock(&inode->i_mutex);
> +
> mark_inode_dirty(inode);
> setflags_out:
> mnt_drop_write(filp->f_path.mnt);
> @@ -88,20 +89,29 @@ setflags_out:
> }
> case EXT2_IOC_GETVERSION:
> return put_user(inode->i_generation, (int __user *) arg);
> - case EXT2_IOC_SETVERSION:
> + case EXT2_IOC_SETVERSION: {
> + __u32 generation;
> +
> if (!inode_owner_or_capable(inode))
> return -EPERM;
> ret = mnt_want_write(filp->f_path.mnt);
> if (ret)
> return ret;
> - if (get_user(inode->i_generation, (int __user *) arg)) {
> + if (get_user(generation, (int __user *) arg)) {
> ret = -EFAULT;
> - } else {
> - inode->i_ctime = CURRENT_TIME_SEC;
> - mark_inode_dirty(inode);
> + goto setversion_out;
> }
> +
> + mutex_lock(&inode->i_mutex);
> + inode->i_ctime = CURRENT_TIME_SEC;
> + inode->i_generation = generation;
> + mutex_unlock(&inode->i_mutex);
> +
> + mark_inode_dirty(inode);
> +setversion_out:
> mnt_drop_write(filp->f_path.mnt);
> return ret;
> + }
> case EXT2_IOC_GETRSVSZ:
> if (test_opt(inode->i_sb, RESERVATION)
> && S_ISREG(inode->i_mode)
> --
> 1.7.1
--
Jan Kara <[email protected]>
SUSE Labs, CR