2009-08-13 16:10:02

by Curt Wohlgemuth

[permalink] [raw]
Subject: Questions on ext4 and writeback

I've got a question about how ext4 handles the writeback control fields in
ext4_da_writepages().

I understand how the big picture works: since we're doing delayed
allocation, when we're asked to write out a range of dirty pages, we need to
do allocation *now*, and we want to do allocation on the largest contiguous
range of pages possible.

So when __mpage_da_writepage() finds a page discontinuity, we submit the
page extent we have so far for I/O, and call redirty_page_for_writepage() on
the current page, skip the rest of the pages in the pagevec, and wait for a
new transaction.

On return from write_cache_pages(), ext4_da_writepages() will check the
return value from __mpage_da_writepage(), and if it's MPAGE_DA_EXTENT_TAIL,
will bump the number of pages written, and possibly cause another loop to
handle the rest of the pages (after the discontinuity).

What I don't understand is why wbc->pages_skipped is reset in this case.

I *think* that ext4_da_writepages() is trying to undo the effect of
redirty_page_for_writepage() to increment pages_skipped -- since we didn't
really skip this page, we only postponed its handling to the next pagevec.

But the actual submittal of I/O for the previous extent might cause
pages_skipped to be bumped, right? Removing these increments might cause
the accounting to be incorrect, it seems to me.

I think it would be safer to explicitly decrement wbc->pages_skipped after a
call to redirty_page_for_writepage(), rather than reset this value to what
it was when ext4_da_writepages() was called. Can somebody help me
understand where I might be mistaken?

Thanks,
Curt


2009-08-18 16:39:16

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Questions on ext4 and writeback

On Thu, Aug 13, 2009 at 09:09:58AM -0700, Curt Wohlgemuth wrote:
> But the actual submittal of I/O for the previous extent might cause
> pages_skipped to be bumped, right? Removing these increments might cause
> the accounting to be incorrect, it seems to me.

I don't see where the submission of an extent of pages for I/O would
cause page_skipped to be incremented or changed --- am I missing
something?

- Ted

2009-08-18 16:57:53

by Curt Wohlgemuth

[permalink] [raw]
Subject: Re: Questions on ext4 and writeback

Hi Ted:

On Tue, Aug 18, 2009 at 9:39 AM, Theodore Tso<[email protected]> wrote:
> On Thu, Aug 13, 2009 at 09:09:58AM -0700, Curt Wohlgemuth wrote:
>> But the actual submittal of I/O for the previous extent might cause
>> pages_skipped to be bumped, right? ?Removing these increments might cause
>> the accounting to be incorrect, it seems to me.
>
> I don't see where the submission of an extent of pages for I/O would
> cause page_skipped to be incremented or changed --- am I missing
> something?

Probably not. But it seems to me that a call order of

write_cache_pages -> __mpage_da_writepage -> mpage_da_submit_io ->
ext4_writepage

can cause pages_skipped to be incremented, either directly in
ext4_writepage() (page has delayed/unwritted buffers) or in
__block_write_full_page() (buffer already locked).

In fact, in mpage_da_submit_io(), pages_written is only incremented if
pages_skipped hasn't been bumped -- so that routine already knows that
pages_skipped might be changed on I/O submit.

If this happens, ext4_da_writepages() will wipe out the fact that
pages_skipped was changed during submittal, won't it?

Thanks,
Curt

2009-08-20 07:05:46

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: Questions on ext4 and writeback

On Tue, Aug 18, 2009 at 09:57:49AM -0700, Curt Wohlgemuth wrote:
> Hi Ted:
>
> On Tue, Aug 18, 2009 at 9:39 AM, Theodore Tso<[email protected]> wrote:
> > On Thu, Aug 13, 2009 at 09:09:58AM -0700, Curt Wohlgemuth wrote:
> >> But the actual submittal of I/O for the previous extent might cause
> >> pages_skipped to be bumped, right? ?Removing these increments might cause
> >> the accounting to be incorrect, it seems to me.
> >
> > I don't see where the submission of an extent of pages for I/O would
> > cause page_skipped to be incremented or changed --- am I missing
> > something?
>
> Probably not. But it seems to me that a call order of
>
> write_cache_pages -> __mpage_da_writepage -> mpage_da_submit_io ->
> ext4_writepage
>
> can cause pages_skipped to be incremented, either directly in
> ext4_writepage() (page has delayed/unwritted buffers) or in
> __block_write_full_page() (buffer already locked).
>
> In fact, in mpage_da_submit_io(), pages_written is only incremented if
> pages_skipped hasn't been bumped -- so that routine already knows that
> pages_skipped might be changed on I/O submit.
>
> If this happens, ext4_da_writepages() will wipe out the fact that
> pages_skipped was changed during submittal, won't it?
>

In that case we would have set MPAGE_DA_EXTENT_TAIL and we loop again
in the writepages, trying to write again from the beginning and writing the
earlier skipped pages.

In otherwords we retry in writepages untill we are able to write all the
nr_to_write pages.

-aneesh