2011-09-13 18:33:36

by Allison Henderson

[permalink] [raw]
Subject: i_mutex questions

Hi All,

I have been trying to find a way to synchronize punch hole with read and
write operations with out the use of i_mutex. The concern is that after
punch hole has released the pages inside the hole, another process may
remap the page to a block before punch has taken i_data_sem. I think
putting i_mutex around the punch hole operation would fix this, but
since we are trying to avoid further improper use of i_mutex, I am
trying to avoid that solution.

I cannot use i_data_sem to protect the pages because it seems most of
the code has already established a locking order of pages first, then
i_data_sem. So moving i_data_sem up tends to cause a lot of dead locks.
I'm thinking that there probably needs to be a another mutex involved
some where, but I wasnt sure if some one is already working on the idea
of introducing a replacement for i_mutex. So I just wanted to know if
there are any plans already in motion for this, or if any one else could
suggest some ideas for the punch hole issue. Thx all!

Allison Henderson



2011-09-13 19:02:17

by Joel Becker

[permalink] [raw]
Subject: Re: i_mutex questions

On Tue, Sep 13, 2011 at 11:33:29AM -0700, Allison Henderson wrote:
> Hi All,
>
> I have been trying to find a way to synchronize punch hole with read
> and write operations with out the use of i_mutex. The concern is
> that after punch hole has released the pages inside the hole,
> another process may remap the page to a block before punch has taken
> i_data_sem. I think putting i_mutex around the punch hole operation
> would fix this, but since we are trying to avoid further improper
> use of i_mutex, I am trying to avoid that solution.

Hey Allison,
Actually, i_mutex is the normal way to handle this. ocfs2 takes
i_mutex down under its ->fallocate(). Truncate is in the same boat,
which is why do_truncate() takes i_mutex before calling notify_change().
The read-write paths grab i_mutex for buffered operation. They
don't for O_DIRECT, which doesn't map to the pagecache. This is where
i_data_sem should speed things up.

Joel


--

"Anything that is too stupid to be spoken is sung."
- Voltaire

http://www.jlbec.org/
[email protected]

2011-09-13 22:11:41

by Allison Henderson

[permalink] [raw]
Subject: Re: i_mutex questions

On 09/13/2011 12:02 PM, Joel Becker wrote:
> On Tue, Sep 13, 2011 at 11:33:29AM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I have been trying to find a way to synchronize punch hole with read
>> and write operations with out the use of i_mutex. The concern is
>> that after punch hole has released the pages inside the hole,
>> another process may remap the page to a block before punch has taken
>> i_data_sem. I think putting i_mutex around the punch hole operation
>> would fix this, but since we are trying to avoid further improper
>> use of i_mutex, I am trying to avoid that solution.
>
> Hey Allison,
> Actually, i_mutex is the normal way to handle this. ocfs2 takes
> i_mutex down under its ->fallocate(). Truncate is in the same boat,
> which is why do_truncate() takes i_mutex before calling notify_change().
> The read-write paths grab i_mutex for buffered operation. They
> don't for O_DIRECT, which doesn't map to the pagecache. This is where
> i_data_sem should speed things up.
>
> Joel
>
Hi Joel,

Well, I actually already had a patch that was trying to use i_mutex to
solve this ([PATCH 4/6 v7] ext4: Lock i_mutex for punch hole). But we
decided not to apply it because of plans to reduce the usage of i_mutex
in the ext4 code. So I've been trying to figure out a different way to
solve this, but so far I haven't had a whole lot of luck finding a
solution that doesn't involve introducing a new locking mechanism. So I
wanted to check back here for more details on what the plan for i_mutex
is so I dont conflict with anything that might already be going on. :)

Ted, would you be able to give us some more details on this topic? Thx!

Allison Henderson

2011-09-14 00:06:10

by Joel Becker

[permalink] [raw]
Subject: Re: i_mutex questions

On Tue, Sep 13, 2011 at 03:10:56PM -0700, Allison Henderson wrote:
> Well, I actually already had a patch that was trying to use i_mutex
> to solve this ([PATCH 4/6 v7] ext4: Lock i_mutex for punch hole).
> But we decided not to apply it because of plans to reduce the usage
> of i_mutex in the ext4 code. So I've been trying to figure out a
> different way to solve this, but so far I haven't had a whole lot of
> luck finding a solution that doesn't involve introducing a new
> locking mechanism. So I wanted to check back here for more details
> on what the plan for i_mutex is so I dont conflict with anything
> that might already be going on. :)

Sure ;-) If you find another mechanism that reduces contention
but still plays well with read/write et al, please let us all know.
Getting i_mutex out of read/write would be interesting.

Joel

--

"A narcissist is someone better looking than you are."
- Gore Vidal

http://www.jlbec.org/
[email protected]

2011-09-14 01:29:28

by Yongqiang Yang

[permalink] [raw]
Subject: Re: i_mutex questions

On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
<[email protected]> wrote:
> Hi All,
>
> I have been trying to find a way to synchronize punch hole with read and
> write operations with out the use of i_mutex. ?The concern is that after
> punch hole has released the pages inside the hole, another process may remap
> the page to a block before punch has taken i_data_sem. ?I think putting
> i_mutex around the punch hole operation would fix this, but since we are
> trying to avoid further improper use of i_mutex, I am trying to avoid that
> solution.
>
> I cannot use i_data_sem to protect the pages because it seems most of the
> code has already established a locking order of pages first, then
> i_data_sem. ?So moving i_data_sem up tends to cause a lot of dead locks.
> ?I'm thinking that there probably needs to be a another mutex involved some
> where, but I wasnt sure if some one is already working on the idea of
> introducing a replacement for i_mutex. ?So I just wanted to know if there
> are any plans already in motion for this, or if any one else could suggest
> some ideas for the punch hole issue. ?Thx all!

HI,

Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
i_rwlock semaphore) which collected some feedbacks suggesting using
extent lock instead of a read-write semaphore. If there is extent
lock implementation in ext4, then fallocate can use it, maybe
dioread-nolock can use it as well, e.g. locking a range and unlocking
the range until the extent is converted from unwritten to init.

Yongqiang.


>
> Allison Henderson
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>



--
Best Wishes
Yongqiang Yang

2011-09-14 04:23:55

by Andreas Dilger

[permalink] [raw]
Subject: Re: i_mutex questions

On 2011-09-13, at 7:29 PM, Yongqiang Yang wrote:
> On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
> <[email protected]> wrote:
>> Hi All,
>>
>> I have been trying to find a way to synchronize punch hole with read and
>> write operations with out the use of i_mutex. The concern is that after
>> punch hole has released the pages inside the hole, another process may remap
>> the page to a block before punch has taken i_data_sem. I think putting
>> i_mutex around the punch hole operation would fix this, but since we are
>> trying to avoid further improper use of i_mutex, I am trying to avoid that
>> solution.
>>
>> I cannot use i_data_sem to protect the pages because it seems most of the
>> code has already established a locking order of pages first, then
>> i_data_sem. So moving i_data_sem up tends to cause a lot of dead locks.
>> I'm thinking that there probably needs to be a another mutex involved some
>> where, but I wasnt sure if some one is already working on the idea of
>> introducing a replacement for i_mutex. So I just wanted to know if there
>> are any plans already in motion for this, or if any one else could suggest
>> some ideas for the punch hole issue. Thx all!
>
> Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
> i_rwlock semaphore) which collected some feedbacks suggesting using
> extent lock instead of a read-write semaphore. If there is extent
> lock implementation in ext4, then fallocate can use it, maybe
> dioread-nolock can use it as well, e.g. locking a range and unlocking
> the range until the extent is converted from unwritten to init.

We have a prototype patch for extent locking for ext4. We are planning to
use this for parallel locking of htree directories, but it could potentially
be modified to for extent locking of files.

The current patch is below, but it hasn't gone through a lot of testing:

http://review.whamcloud.com/#patch,sidebyside,375,2,ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch

Cheers, Andreas






2011-09-14 18:36:10

by Allison Henderson

[permalink] [raw]
Subject: Re: i_mutex questions

On 09/13/2011 09:23 PM, Andreas Dilger wrote:
> On 2011-09-13, at 7:29 PM, Yongqiang Yang wrote:
>> On Wed, Sep 14, 2011 at 2:33 AM, Allison Henderson
>> <[email protected]> wrote:
>>> Hi All,
>>>
>>> I have been trying to find a way to synchronize punch hole with read and
>>> write operations with out the use of i_mutex. The concern is that after
>>> punch hole has released the pages inside the hole, another process may remap
>>> the page to a block before punch has taken i_data_sem. I think putting
>>> i_mutex around the punch hole operation would fix this, but since we are
>>> trying to avoid further improper use of i_mutex, I am trying to avoid that
>>> solution.
>>>
>>> I cannot use i_data_sem to protect the pages because it seems most of the
>>> code has already established a locking order of pages first, then
>>> i_data_sem. So moving i_data_sem up tends to cause a lot of dead locks.
>>> I'm thinking that there probably needs to be a another mutex involved some
>>> where, but I wasnt sure if some one is already working on the idea of
>>> introducing a replacement for i_mutex. So I just wanted to know if there
>>> are any plans already in motion for this, or if any one else could suggest
>>> some ideas for the punch hole issue. Thx all!
>>
>> Lukas sent out a patch ([PATCH] ext4: Make reads/writes atomic with
>> i_rwlock semaphore) which collected some feedbacks suggesting using
>> extent lock instead of a read-write semaphore. If there is extent
>> lock implementation in ext4, then fallocate can use it, maybe
>> dioread-nolock can use it as well, e.g. locking a range and unlocking
>> the range until the extent is converted from unwritten to init.
>
> We have a prototype patch for extent locking for ext4. We are planning to
> use this for parallel locking of htree directories, but it could potentially
> be modified to for extent locking of files.
>
> The current patch is below, but it hasn't gone through a lot of testing:
>
> http://review.whamcloud.com/#patch,sidebyside,375,2,ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch
>
> Cheers, Andreas
>

Alrighty then, I will have a look at the existing patches. Thx!

Allison Henderson

>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>