2011-11-23 11:48:31

by Yongqiang Yang

[permalink] [raw]
Subject: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

If there is a unwritten but clean buffer in a page and there is a dirty buffer
after the buffer, then mpage_submit_io does not write the dirty buffer out.
As a result, da_writepages loops forever.

This patch fixes the problem by checking dirty flag.

Signed-off-by: Yongqiang Yang <[email protected]>
---
fs/ext4/inode.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 755f6c7..20a1d17 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
clear_buffer_unwritten(bh);
}

- /* skip page if block allocation undone */
- if (buffer_delay(bh) || buffer_unwritten(bh))
+ /*
+ * skip page if block allocation undone and
+ * block is dirty
+ */
+ if (ext4_bh_delay_or_unwritten(NULL, bh))
skip_page = 1;
bh = bh->b_this_page;
block_start += bh->b_size;
--
1.7.5.1



2011-11-23 11:48:34

by Yongqiang Yang

[permalink] [raw]
Subject: [PATCH 2/2] ext4: let ext4_discard_partial_buffers handle pages without buffers correctly

If a page has been read to memory and never been written, it has no buffers,
but we should handle the page in truncate or punch hole.

Vfs code of writing operations has handled holes correctly, so this patch removes
the code handling hole in writing operations.

Signed-off-by: Yongqiang Yang <[email protected]>
---
fs/ext4/inode.c | 41 ++---------------------------------------
1 files changed, 2 insertions(+), 39 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 20a1d17..bd793d6 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2385,7 +2385,6 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
pgoff_t index;
struct inode *inode = mapping->host;
handle_t *handle;
- loff_t page_len;

index = pos >> PAGE_CACHE_SHIFT;

@@ -2432,13 +2431,6 @@ retry:
*/
if (pos + len > inode->i_size)
ext4_truncate_failed_write(inode);
- } else {
- page_len = pos & (PAGE_CACHE_SIZE - 1);
- if (page_len > 0) {
- ret = ext4_discard_partial_page_buffers_no_lock(handle,
- inode, page, pos - page_len, page_len,
- EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED);
- }
}

if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
@@ -2481,7 +2473,6 @@ static int ext4_da_write_end(struct file *file,
loff_t new_i_size;
unsigned long start, end;
int write_mode = (int)(unsigned long)fsdata;
- loff_t page_len;

if (write_mode == FALL_BACK_TO_NONDELALLOC) {
if (ext4_should_order_data(inode)) {
@@ -2530,16 +2521,6 @@ static int ext4_da_write_end(struct file *file,
}
ret2 = generic_write_end(file, mapping, pos, len, copied,
page, fsdata);
-
- page_len = PAGE_CACHE_SIZE -
- ((pos + copied - 1) & (PAGE_CACHE_SIZE - 1));
-
- if (page_len > 0) {
- ret = ext4_discard_partial_page_buffers_no_lock(handle,
- inode, page, pos + copied - 1, page_len,
- EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED);
- }
-
copied = ret2;
if (ret2 < 0)
ret = ret2;
@@ -3201,26 +3182,8 @@ int ext4_discard_partial_page_buffers_no_lock(handle_t *handle,

iblock = index << (PAGE_CACHE_SHIFT - inode->i_sb->s_blocksize_bits);

- if (!page_has_buffers(page)) {
- /*
- * If the range to be discarded covers a partial block
- * we need to get the page buffers. This is because
- * partial blocks cannot be released and the page needs
- * to be updated with the contents of the block before
- * we write the zeros on top of it.
- */
- if ((from & (blocksize - 1)) ||
- ((from + length) & (blocksize - 1))) {
- create_empty_buffers(page, blocksize, 0);
- } else {
- /*
- * If there are no partial blocks,
- * there is nothing to update,
- * so we can return now
- */
- return 0;
- }
- }
+ if (!page_has_buffers(page))
+ create_empty_buffers(page, blocksize, 0);

/* Find the buffer that contains "offset" */
bh = page_buffers(page);
--
1.7.5.1


2011-11-26 15:08:35

by Tao Ma

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

Hi Yongqiang,
On 11/23/2011 05:15 PM, Yongqiang Yang wrote:
> If there is a unwritten but clean buffer in a page and there is a dirty buffer
> after the buffer, then mpage_submit_io does not write the dirty buffer out.
> As a result, da_writepages loops forever.
Did you ever meet with this bug or just find it to be a possible bug by
skimming the code? Actually, I can't find a proper way to get a buffer
which can satisfy the check.

Thanks
Tao
>
> This patch fixes the problem by checking dirty flag.
>
> Signed-off-by: Yongqiang Yang <[email protected]>
> ---
> fs/ext4/inode.c | 7 +++++--
> 1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 755f6c7..20a1d17 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
> clear_buffer_unwritten(bh);
> }
>
> - /* skip page if block allocation undone */
> - if (buffer_delay(bh) || buffer_unwritten(bh))
> + /*
> + * skip page if block allocation undone and
> + * block is dirty
> + */
> + if (ext4_bh_delay_or_unwritten(NULL, bh))
> skip_page = 1;
> bh = bh->b_this_page;
> block_start += bh->b_size;



2011-11-26 15:23:14

by Tao Ma

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On 11/26/2011 11:13 PM, Yongqiang Yang wrote:
> Hi,
>
> It can be tested by xfstests 61 or 91, I did not remember which one.
ok, I will try it later. But I still wonder how it happens. See my
comments below.
>
> It can be reproduced by reading on a fallocted block and write the block
> after the fallocted block. Then the written block can not be written
> out by da_writepages.
Why? in read no bh will be created, and in write only the written bh
will be mapped and set unwritten. How could that happen? Sorry, but this
explanation doesn't convince me.
>
> Yongqiang.
>
> On Saturday, November 26, 2011, Tao Ma <[email protected] <mailto:[email protected]>> wrote:
>> Hi Yongqiang,
>> On 11/23/2011 05:15 PM, Yongqiang Yang wrote:
>>> If there is a unwritten but clean buffer in a page and there is a
> dirty buffer
>>> after the buffer, then mpage_submit_io does not write the dirty
> buffer out.
>>> As a result, da_writepages loops forever.
>> Did you ever meet with this bug or just find it to be a possible bug by
>> skimming the code? Actually, I can't find a proper way to get a buffer
>> which can satisfy the check.
>>
>> Thanks
>> Tao
>>>
>>> This patch fixes the problem by checking dirty flag.
>>>
>>> Signed-off-by: Yongqiang Yang <[email protected]
> <mailto:[email protected]>>
>>> ---
>>> fs/ext4/inode.c | 7 +++++--
>>> 1 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 755f6c7..20a1d17 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct
> mpage_da_data *mpd,
>>> clear_buffer_unwritten(bh);
>>> }
>>>
>>> - /* skip page if block allocation undone */
>>> - if (buffer_delay(bh) ||
> buffer_unwritten(bh))
>>> + /*
>>> + * skip page if block allocation undone and
>>> + * block is dirty
>>> + */
>>> + if (ext4_bh_delay_or_unwritten(NULL, bh))
>>> skip_page = 1;
>>> bh = bh->b_this_page;
>>> block_start += bh->b_size;
>>
>>
>>
>
> --
> Best Wishes
> Yongqiang Yang
>


2011-11-27 08:59:40

by Tao Ma

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On 11/27/2011 09:29 AM, Yongqiang Yang wrote:
>
>
> On Saturday, November 26, 2011, Tao Ma <[email protected] <mailto:[email protected]>> wrote:
>> On 11/26/2011 11:13 PM, Yongqiang Yang wrote:
>>> Hi,
>>>
>>> It can be tested by xfstests 61 or 91, I did not remember which one.
>> ok, I will try it later. But I still wonder how it happens. See my
>> comments below.
>>>
>>> It can be reproduced by reading on a fallocted block and write the block
>>> after the fallocted block. Then the written block can not be written
>>> out by da_writepages.
>> Why? in read no bh will be created, and in write only the written bh
>> will be mapped and set unwritten. How could that happen? Sorry, but this
>> explanation doesn't convince me.
> Ok! The bug was found with punch hole enabled, and punch hole creates
> empty buffers sometimes.
> Actually, punching hole can read the whole page like read operation
> does, but it do not. May be this is the source of the problem.
> The problemastic function is discard_partial_buffers. I post a patch
> fixing another bug in it in this series. If we read whole page in this
> function, the problem can also be solved.
Got it and thanks for the explanation. So we introduced a bug and then
fix it in another way. ;)

Thanks
Tao
>
> Yongqiang.
>>>
>>> Yongqiang.
>>>
>>> On Saturday, November 26, 2011, Tao Ma <[email protected] <mailto:[email protected]>
> <mailto:[email protected] <mailto:[email protected]>>> wrote:
>>>> Hi Yongqiang,
>>>> On 11/23/2011 05:15 PM, Yongqiang Yang wrote:
>>>>> If there is a unwritten but clean buffer in a page and there is a
>>> dirty buffer
>>>>> after the buffer, then mpage_submit_io does not write the dirty
>>> buffer out.
>>>>> As a result, da_writepages loops forever.
>>>> Did you ever meet with this bug or just find it to be a possible bug by
>>>> skimming the code? Actually, I can't find a proper way to get a buffer
>>>> which can satisfy the check.
>>>>
>>>> Thanks
>>>> Tao
>>>>>
>>>>> This patch fixes the problem by checking dirty flag.
>>>>>
>>>>> Signed-off-by: Yongqiang Yang <[email protected]
> <mailto:[email protected]>
>>> <mailto:[email protected] <mailto:[email protected]>>>
>>>>> ---
>>>>> fs/ext4/inode.c | 7 +++++--
>>>>> 1 files changed, 5 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>>>> index 755f6c7..20a1d17 100644
>>>>> --- a/fs/ext4/inode.c
>>>>> +++ b/fs/ext4/inode.c
>>>>> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct
>>> mpage_da_data *mpd,
>>>>> clear_buffer_unwritten(bh);
>>>>> }
>>>>>
>>>>> - /* skip page if block allocation
> undone */
>>>>> - if (buffer_delay(bh) ||
>>> buffer_unwritten(bh))
>>>>> + /*
>>>>> + * skip page if block allocation
> undone and
>>>>> + * block is dirty
>>>>> + */
>>>>> + if (ext4_bh_delay_or_unwritten(NULL, bh))
>>>>> skip_page = 1;
>>>>> bh = bh->b_this_page;
>>>>> block_start += bh->b_size;
>>>>
>>>>
>>>>
>>>
>>> --
>>> Best Wishes
>>> Yongqiang Yang
>>>
>>
>>
>
> --
> Best Wishes
> Yongqiang Yang
>


2011-12-01 20:13:14

by Allison Henderson

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On 11/23/2011 02:15 AM, Yongqiang Yang wrote:
> If there is a unwritten but clean buffer in a page and there is a dirty buffer
> after the buffer, then mpage_submit_io does not write the dirty buffer out.
> As a result, da_writepages loops forever.
>
> This patch fixes the problem by checking dirty flag.
>
> Signed-off-by: Yongqiang Yang<[email protected]>
> ---
> fs/ext4/inode.c | 7 +++++--
> 1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 755f6c7..20a1d17 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
> clear_buffer_unwritten(bh);
> }
>
> - /* skip page if block allocation undone */
> - if (buffer_delay(bh) || buffer_unwritten(bh))
> + /*
> + * skip page if block allocation undone and
> + * block is dirty
> + */
> + if (ext4_bh_delay_or_unwritten(NULL, bh))
> skip_page = 1;
> bh = bh->b_this_page;
> block_start += bh->b_size;

Hi Yongqiang,

Thank you for looking into the punch hole code, I know there's been some
recent bugs reported, so I am looking at it too. I've applied your
patch and ran it through an fsx stress test, and I notice there are some
failures, but it appears to run longer with the patch then with out it,
so it may not be the cause of the errors I'm seeing. I think maybe
something else may have happened between now and the last time it made
it through 24hr of fsx (at least for me :) ), so I'm continuing to look
through the recent code changes. I will keep folks posted on my
findings. Thx!

Allison Henderson


2011-12-02 01:15:47

by Yongqiang Yang

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On Fri, Dec 2, 2011 at 4:13 AM, Allison Henderson
<[email protected]> wrote:
> On 11/23/2011 02:15 AM, Yongqiang Yang wrote:
>>
>> If there is a unwritten but clean buffer in a page and there is a dirty
>> buffer
>> after the buffer, then mpage_submit_io does not write the dirty buffer
>> out.
>> As a result, da_writepages loops forever.
>>
>> This patch fixes the problem by checking dirty flag.
>>
>> Signed-off-by: Yongqiang Yang<[email protected]>
>> ---
>> ?fs/ext4/inode.c | ? ?7 +++++--
>> ?1 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 755f6c7..20a1d17 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct mpage_da_data
>> *mpd,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?clear_buffer_unwritten(bh);
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?}
>>
>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? /* skip page if block allocation undone */
>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if (buffer_delay(bh) ||
>> buffer_unwritten(bh))
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? /*
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?* skip page if block allocation undone
>> and
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?* block is dirty
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?*/
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if (ext4_bh_delay_or_unwritten(NULL, bh))
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?skip_page = 1;
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bh = bh->b_this_page;
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?block_start += bh->b_size;
>
> Hi Yongqiang,
>
> Thank you for looking into the punch hole code, I know there's been some
> recent bugs reported, so I am looking at it too. ?I've applied your patch
> and ran it through an fsx stress test, and I notice there are some failures,
> but it appears to run longer with the patch then with out it, so it may not
> be the cause of the errors I'm seeing. I think maybe something else may have
> happened between now and the last time it made it through 24hr of fsx (at
> least for me :) ), so I'm continuing to look through the recent code
On the other hand, xfstests have a lot of changes since your last
test. I am not sure if original xfstests did not discover some
errors.
> changes. ?I will keep folks posted on my findings. ?Thx!
Did you test it by multi-thread tests or single thread tests? If
multi-thread, I suggest that we hold the i_mutex in punching hole.


Yongqiang.
>
> Allison Henderson
>
>



--
Best Wishes
Yongqiang Yang

2011-12-02 06:47:09

by Allison Henderson

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On 12/01/2011 06:15 PM, Yongqiang Yang wrote:
> On Fri, Dec 2, 2011 at 4:13 AM, Allison Henderson
> <[email protected]> wrote:
>> On 11/23/2011 02:15 AM, Yongqiang Yang wrote:
>>>
>>> If there is a unwritten but clean buffer in a page and there is a dirty
>>> buffer
>>> after the buffer, then mpage_submit_io does not write the dirty buffer
>>> out.
>>> As a result, da_writepages loops forever.
>>>
>>> This patch fixes the problem by checking dirty flag.
>>>
>>> Signed-off-by: Yongqiang Yang<[email protected]>
>>> ---
>>> fs/ext4/inode.c | 7 +++++--
>>> 1 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 755f6c7..20a1d17 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -1339,8 +1339,11 @@ static int mpage_da_submit_io(struct mpage_da_data
>>> *mpd,
>>> clear_buffer_unwritten(bh);
>>> }
>>>
>>> - /* skip page if block allocation undone */
>>> - if (buffer_delay(bh) ||
>>> buffer_unwritten(bh))
>>> + /*
>>> + * skip page if block allocation undone
>>> and
>>> + * block is dirty
>>> + */
>>> + if (ext4_bh_delay_or_unwritten(NULL, bh))
>>> skip_page = 1;
>>> bh = bh->b_this_page;
>>> block_start += bh->b_size;
>>
>> Hi Yongqiang,
>>
>> Thank you for looking into the punch hole code, I know there's been some
>> recent bugs reported, so I am looking at it too. I've applied your patch
>> and ran it through an fsx stress test, and I notice there are some failures,
>> but it appears to run longer with the patch then with out it, so it may not
>> be the cause of the errors I'm seeing. I think maybe something else may have
>> happened between now and the last time it made it through 24hr of fsx (at
>> least for me :) ), so I'm continuing to look through the recent code
> On the other hand, xfstests have a lot of changes since your last
> test. I am not sure if original xfstests did not discover some
> errors.
>> changes. I will keep folks posted on my findings. Thx!
> Did you test it by multi-thread tests or single thread tests? If
> multi-thread, I suggest that we hold the i_mutex in punching hole.

Alrighty, well the test Im using is just the fsx test (in xfstests under
the ltp folder). I will check and see if there's been any updates to it
since then. The command I usually use is just "./fsx -d -S 1
/mnt/ext4MntPt/test" from the ltp folder. I dont think it's
multi-threaded, but the i_mutex lock is another work item on my plate
that I haven't gotten to yet. The reason we dont want to just lock
i_mutex is because folks are trying to reduce the use of i_mutex in
ext4. So the plan is to implement extent locks to replace i_mutex all
together. That's another project, but I will do a trial run with
i_mutex locked just to rule it out.

Allison Henderson

>
>
> Yongqiang.
>>
>> Allison Henderson
>>
>>
>
>
>


2011-12-14 03:05:50

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On Wed, Nov 23, 2011 at 05:15:25PM +0800, Yongqiang Yang wrote:
> If there is a unwritten but clean buffer in a page and there is a dirty buffer
> after the buffer, then mpage_submit_io does not write the dirty buffer out.
> As a result, da_writepages loops forever.
>
> This patch fixes the problem by checking dirty flag.
>
> Signed-off-by: Yongqiang Yang <[email protected]>

Thanks, applied.

- Ted

2011-12-14 03:06:00

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 2/2] ext4: let ext4_discard_partial_buffers handle pages without buffers correctly

On Wed, Nov 23, 2011 at 05:15:26PM +0800, Yongqiang Yang wrote:
> If a page has been read to memory and never been written, it has no buffers,
> but we should handle the page in truncate or punch hole.
>
> Vfs code of writing operations has handled holes correctly, so this patch removes
> the code handling hole in writing operations.
>
> Signed-off-by: Yongqiang Yang <[email protected]>

Thanks, applied.

- Ted