2014-09-05 01:59:41

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix deadlock of i_data_sem in ext4_mark_inode_dirty()

On Thu, Sep 04, 2014 at 04:49:58PM +0800, Li Xi wrote:
> There are multiple places where ext4_mark_inode_dirty() is called holding
> write lock of EXT4_I(inode)->i_data_sem. However, if
> ext4_mark_inode_dirty() needs to expand inode size, this will cause
> deadlock when ext4_xattr_block_set() tries to get read lock of
> EXT4_I(inode)->i_data_sem.

This was with inline data enabled, right?

The problem with your change is that the locking is the way it is in
order to fix a bug which Jan Kara identified in commit
90e775b71ac4e68: "ext4: fix lost truncate due to race with writeback".

ext4: fix lost truncate due to race with writeback

The following race can lead to a loss of i_disksize update from truncate
thus resulting in a wrong inode size if the inode size isn't updated
again before inode is reclaimed:

ext4_setattr()                          mpage_map_and_submit_extent()
  EXT4_I(inode)->i_disksize = attr->ia_size;
  ...                                     ...
                                          disksize = ((loff_t)mpd->first_page) << PAGE_CACHE_SHIFT
                                          /* False because i_size isn't
                                           * updated yet */
                                          if (disksize > i_size_read(inode))
                                          /* True, because i_disksize is
                                           * already truncated */
                                          if (disksize > EXT4_I(inode)->i_disksize)
                                            /* Overwrite i_disksize
                                             * update from truncate */
                                            ext4_update_i_disksize()
  i_size_write(inode, attr->ia_size);

For other places updating i_disksize such race cannot happen because
i_mutex prevents these races. Writeback is the only place where we do
not hold i_mutex and we cannot grab it there because of lock ordering.

We fix the race by doing both i_disksize and i_size update in truncate
atomically under i_data_sem and in mpage_map_and_submit_extent() we move
the check against i_size under i_data_sem as well.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]

So I think we need to find another way to fix this problem. There are
a limited number of places before we call ext4_mark_inode_dirty()
where i_size will grow such that the inline data code might need to
move the data out from i_blocks[].

It might make more sense to have a helper function which checks to see
if this condition holds, and do the conversion away from using
inline_data for that inode *before* we call ext4_mark_inode_dirty().

Does that make sense to you?

Regards,

- Ted


2014-09-05 02:29:16

by Li Xi

Subject: Re: [PATCH] ext4: fix deadlock of i_data_sem in ext4_mark_inode_dirty()

On Fri, Sep 5, 2014 at 9:59 AM, Theodore Ts'o <[email protected]> wrote:
> On Thu, Sep 04, 2014 at 04:49:58PM +0800, Li Xi wrote:
>> There are multiple places where ext4_mark_inode_dirty() is called holding
>> write lock of EXT4_I(inode)->i_data_sem. However, if
>> ext4_mark_inode_dirty() needs to expand inode size, this will cause
>> deadlock when ext4_xattr_block_set() tries to get read lock of
>> EXT4_I(inode)->i_data_sem.
>
> This was with inline data enabled, right?
I hit this problem when starting a kernel with project quota support for ext4.
The ext4 file system was not formatted with the project quota feature, so it
tried to extend the space for the project ID. This problem happened every time
the kernel was rebooted. Inline data was not enabled on that file
system. I am not sure whether this problem will happen under other
circumstances. :)
>
>
> So I think we need to find another way to fix this problem. There are
> a limited number of places before we call ext4_mark_inode_dirty()
> where i_size will grow such that the inline data code might need to
> move the data out from i_blocks[].
>
> It might make more sense to have a helper function which checks to see
> if this condition holds, and do the conversion away from using
> inline_data for that inode *before* we call ext4_mark_inode_dirty().
>
> Does that make sense to you?
Sure, I agree there should be a better solution for this problem.

2014-09-05 03:52:56

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix deadlock of i_data_sem in ext4_mark_inode_dirty()

On Fri, Sep 05, 2014 at 10:29:16AM +0800, Li Xi wrote:
> On Fri, Sep 5, 2014 at 9:59 AM, Theodore Ts'o <[email protected]> wrote:
> > On Thu, Sep 04, 2014 at 04:49:58PM +0800, Li Xi wrote:
> >> There are multiple places where ext4_mark_inode_dirty() is called holding
> >> write lock of EXT4_I(inode)->i_data_sem. However, if
> >> ext4_mark_inode_dirty() needs to expand inode size, this will cause
> >> deadlock when ext4_xattr_block_set() tries to get read lock of
> >> EXT4_I(inode)->i_data_sem.
> >
> > This was with inline data enabled, right?
> I hit this problem when starting a kernel with project quota support for ext4.
> The ext4 file system was not formatted with the project quota feature, so it
> tried to extend the space for the project ID. This problem happened every time
> the kernel was rebooted. Inline data was not enabled on that file
> system. I am not sure whether this problem will happen under other
> circumstances. :)

I'm not understanding why expanding the inode size would result in
needing to call ext4_xattr_block_set. Was that because you were
storing the project quota in the xattr? I'm just trying to understand
the context.

Please also note that a recent set of patches (sent to the ext4 list
and in the ext4 git tree) has removed the need for taking i_data_sem
in xattr.c:

http://patchwork.ozlabs.org/patch/385347
http://patchwork.ozlabs.org/patch/385348
http://patchwork.ozlabs.org/patch/385346

(It's the 2/3 patch that removes taking i_data_sem read lock in xattr.c.)

Regards,

- Ted

2014-09-05 04:44:50

by Li Xi

Subject: Re: [PATCH] ext4: fix deadlock of i_data_sem in ext4_mark_inode_dirty()

On Fri, Sep 5, 2014 at 11:30 AM, Theodore Ts'o <[email protected]> wrote:
> On Fri, Sep 05, 2014 at 10:29:16AM +0800, Li Xi wrote:
>> On Fri, Sep 5, 2014 at 9:59 AM, Theodore Ts'o <[email protected]> wrote:
>> > On Thu, Sep 04, 2014 at 04:49:58PM +0800, Li Xi wrote:
>> >> There are multiple places where ext4_mark_inode_dirty() is called holding
>> >> write lock of EXT4_I(inode)->i_data_sem. However, if
>> >> ext4_mark_inode_dirty() needs to expand inode size, this will cause
>> >> deadlock when ext4_xattr_block_set() tries to get read lock of
>> >> EXT4_I(inode)->i_data_sem.
>> >
>> > This was with inline data enabled, right?
>> I hit this problem when starting a kernel with project quota support for ext4.
>> The ext4 file system was not formatted with the project quota feature, so it
>> tried to extend the space for the project ID. This problem happened every time
>> the kernel was rebooted. Inline data was not enabled on that file
>> system. I am not sure whether this problem will happen under other
>> circumstances. :)
>
> I'm not understanding why expanding the inode size would result in
> needing to call ext4_xattr_block_set. Was that because you were
> storing the project quota in the xattr? I'm just trying to understand
> the context.
Yeah, you are right. The problem happened when the project ID was
saved as an xattr. And I can't reproduce the same problem with the new
patches.
>
> Please also note that a recent set of patches (sent to the ext4 list
> and in the ext4 git tree) has removed the need for taking i_data_sem
> in xattr.c:
>
> http://patchwork.ozlabs.org/patch/385347
> http://patchwork.ozlabs.org/patch/385348
> http://patchwork.ozlabs.org/patch/385346
>
> (It's the 2/3 patch that removes taking i_data_sem read lock in xattr.c.)
>
Great! Please ignore this patch since it won't be a problem
any more.

Regards,

- Li Xi