2024-02-03 08:18:08

by Christoph Hellwig

Subject: [PATCH 05/13] writeback: rework the loop termination condition in write_cache_pages

Rework the way we deal with the cleanup after the writepage call.

First handle the magic AOP_WRITEPAGE_ACTIVATE separately from real error
returns to get it out of the way of the actual error handling path.

Then split the handling into integrity vs non-integrity branches, and
return early using a goto for the non-integrity early loop termination
to remove the need for the done and done_index local variables, and for
assigning the error to ret when we can just return the error directly.
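
In condensed form, the per-folio cleanup after this change looks roughly
like the following (locking, revalidation and tracing omitted):

		error = writepage(folio, wbc, data);
		wbc->nr_to_write -= folio_nr_pages(folio);

		/* AOP_WRITEPAGE_ACTIVATE is not a real error, just unlock */
		if (error == AOP_WRITEPAGE_ACTIVATE) {
			folio_unlock(folio);
			error = 0;
		}

		if (wbc->sync_mode == WB_SYNC_ALL) {
			/* integrity writeback: keep going, report the first error */
			if (error && !ret)
				ret = error;
		} else {
			/* background writeback: stop early via the done label */
			if (error || wbc->nr_to_write <= 0)
				goto done;
		}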

Signed-off-by: Christoph Hellwig <[email protected]>
---
mm/page-writeback.c | 84 ++++++++++++++++++---------------------------
1 file changed, 33 insertions(+), 51 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index c7c494526bc650..88b2c4c111c01b 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2396,13 +2396,12 @@ int write_cache_pages(struct address_space *mapping,
void *data)
{
int ret = 0;
- int done = 0;
int error;
struct folio_batch fbatch;
+ struct folio *folio;
int nr_folios;
pgoff_t index;
pgoff_t end; /* Inclusive */
- pgoff_t done_index;
xa_mark_t tag;

folio_batch_init(&fbatch);
@@ -2419,8 +2418,7 @@ int write_cache_pages(struct address_space *mapping,
} else {
tag = PAGECACHE_TAG_DIRTY;
}
- done_index = index;
- while (!done && (index <= end)) {
+ while (index <= end) {
int i;

nr_folios = filemap_get_folios_tag(mapping, &index, end,
@@ -2430,11 +2428,7 @@ int write_cache_pages(struct address_space *mapping,
break;

for (i = 0; i < nr_folios; i++) {
- struct folio *folio = fbatch.folios[i];
- unsigned long nr;
-
- done_index = folio->index;
-
+ folio = fbatch.folios[i];
folio_lock(folio);

/*
@@ -2469,45 +2463,32 @@ int write_cache_pages(struct address_space *mapping,

trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
error = writepage(folio, wbc, data);
- nr = folio_nr_pages(folio);
- wbc->nr_to_write -= nr;
- if (unlikely(error)) {
- /*
- * Handle errors according to the type of
- * writeback. There's no need to continue for
- * background writeback. Just push done_index
- * past this page so media errors won't choke
- * writeout for the entire file. For integrity
- * writeback, we must process the entire dirty
- * set regardless of errors because the fs may
- * still have state to clear for each page. In
- * that case we continue processing and return
- * the first error.
- */
- if (error == AOP_WRITEPAGE_ACTIVATE) {
- folio_unlock(folio);
- error = 0;
- } else if (wbc->sync_mode != WB_SYNC_ALL) {
- ret = error;
- done_index = folio->index + nr;
- done = 1;
- break;
- }
- if (!ret)
- ret = error;
+ wbc->nr_to_write -= folio_nr_pages(folio);
+
+ if (error == AOP_WRITEPAGE_ACTIVATE) {
+ folio_unlock(folio);
+ error = 0;
}

/*
- * We stop writing back only if we are not doing
- * integrity sync. In case of integrity sync we have to
- * keep going until we have written all the pages
- * we tagged for writeback prior to entering this loop.
+ * For integrity writeback we have to keep going until
+ * we have written all the folios we tagged for
+ * writeback above, even if we run past wbc->nr_to_write
+ * or encounter errors.
+ * We stash away the first error we encounter in
+ * wbc->saved_err so that it can be retrieved when we're
+ * done. This is because the file system may still have
+ * state to clear for each folio.
+ *
+ * For background writeback we exit as soon as we run
+ * past wbc->nr_to_write or encounter the first error.
*/
- done_index = folio->index + nr;
- if (wbc->nr_to_write <= 0 &&
- wbc->sync_mode == WB_SYNC_NONE) {
- done = 1;
- break;
+ if (wbc->sync_mode == WB_SYNC_ALL) {
+ if (error && !ret)
+ ret = error;
+ } else {
+ if (error || wbc->nr_to_write <= 0)
+ goto done;
}
}
folio_batch_release(&fbatch);
@@ -2524,14 +2505,15 @@ int write_cache_pages(struct address_space *mapping,
* of the file if we are called again, which can only happen due to
* -ENOMEM from the file system.
*/
- if (wbc->range_cyclic) {
- if (done)
- mapping->writeback_index = done_index;
- else
- mapping->writeback_index = 0;
- }
-
+ if (wbc->range_cyclic)
+ mapping->writeback_index = 0;
return ret;
+
+done:
+ folio_batch_release(&fbatch);
+ if (wbc->range_cyclic)
+ mapping->writeback_index = folio->index + folio_nr_pages(folio);
+ return error;
}
EXPORT_SYMBOL(write_cache_pages);

--
2.39.2



2024-02-05 11:34:33

by Jan Kara

Subject: Re: [PATCH 05/13] writeback: rework the loop termination condition in write_cache_pages

On Sat 03-02-24 08:11:39, Christoph Hellwig wrote:
> Rework the way we deal with the cleanup after the writepage call.
>
> First handle the magic AOP_WRITEPAGE_ACTIVATE separately from real error
> returns to get it out of the way of the actual error handling path.
>
> Then split the handling into integrity vs non-integrity branches, and
> return early using a goto for the non-integrity early loop termination
> to remove the need for the done and done_index local variables, and for
> assigning the error to ret when we can just return the error directly.
>
> Signed-off-by: Christoph Hellwig <[email protected]>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> mm/page-writeback.c | 84 ++++++++++++++++++---------------------------
> 1 file changed, 33 insertions(+), 51 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index c7c494526bc650..88b2c4c111c01b 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2396,13 +2396,12 @@ int write_cache_pages(struct address_space *mapping,
> void *data)
> {
> int ret = 0;
> - int done = 0;
> int error;
> struct folio_batch fbatch;
> + struct folio *folio;
> int nr_folios;
> pgoff_t index;
> pgoff_t end; /* Inclusive */
> - pgoff_t done_index;
> xa_mark_t tag;
>
> folio_batch_init(&fbatch);
> @@ -2419,8 +2418,7 @@ int write_cache_pages(struct address_space *mapping,
> } else {
> tag = PAGECACHE_TAG_DIRTY;
> }
> - done_index = index;
> - while (!done && (index <= end)) {
> + while (index <= end) {
> int i;
>
> nr_folios = filemap_get_folios_tag(mapping, &index, end,
> @@ -2430,11 +2428,7 @@ int write_cache_pages(struct address_space *mapping,
> break;
>
> for (i = 0; i < nr_folios; i++) {
> - struct folio *folio = fbatch.folios[i];
> - unsigned long nr;
> -
> - done_index = folio->index;
> -
> + folio = fbatch.folios[i];
> folio_lock(folio);
>
> /*
> @@ -2469,45 +2463,32 @@ int write_cache_pages(struct address_space *mapping,
>
> trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> error = writepage(folio, wbc, data);
> - nr = folio_nr_pages(folio);
> - wbc->nr_to_write -= nr;
> - if (unlikely(error)) {
> - /*
> - * Handle errors according to the type of
> - * writeback. There's no need to continue for
> - * background writeback. Just push done_index
> - * past this page so media errors won't choke
> - * writeout for the entire file. For integrity
> - * writeback, we must process the entire dirty
> - * set regardless of errors because the fs may
> - * still have state to clear for each page. In
> - * that case we continue processing and return
> - * the first error.
> - */
> - if (error == AOP_WRITEPAGE_ACTIVATE) {
> - folio_unlock(folio);
> - error = 0;
> - } else if (wbc->sync_mode != WB_SYNC_ALL) {
> - ret = error;
> - done_index = folio->index + nr;
> - done = 1;
> - break;
> - }
> - if (!ret)
> - ret = error;
> + wbc->nr_to_write -= folio_nr_pages(folio);
> +
> + if (error == AOP_WRITEPAGE_ACTIVATE) {
> + folio_unlock(folio);
> + error = 0;
> }
>
> /*
> - * We stop writing back only if we are not doing
> - * integrity sync. In case of integrity sync we have to
> - * keep going until we have written all the pages
> - * we tagged for writeback prior to entering this loop.
> + * For integrity writeback we have to keep going until
> + * we have written all the folios we tagged for
> + * writeback above, even if we run past wbc->nr_to_write
> + * or encounter errors.
> + * We stash away the first error we encounter in
> + * wbc->saved_err so that it can be retrieved when we're
> + * done. This is because the file system may still have
> + * state to clear for each folio.
> + *
> + * For background writeback we exit as soon as we run
> + * past wbc->nr_to_write or encounter the first error.
> */
> - done_index = folio->index + nr;
> - if (wbc->nr_to_write <= 0 &&
> - wbc->sync_mode == WB_SYNC_NONE) {
> - done = 1;
> - break;
> + if (wbc->sync_mode == WB_SYNC_ALL) {
> + if (error && !ret)
> + ret = error;
> + } else {
> + if (error || wbc->nr_to_write <= 0)
> + goto done;
> }
> }
> folio_batch_release(&fbatch);
> @@ -2524,14 +2505,15 @@ int write_cache_pages(struct address_space *mapping,
> * of the file if we are called again, which can only happen due to
> * -ENOMEM from the file system.
> */
> - if (wbc->range_cyclic) {
> - if (done)
> - mapping->writeback_index = done_index;
> - else
> - mapping->writeback_index = 0;
> - }
> -
> + if (wbc->range_cyclic)
> + mapping->writeback_index = 0;
> return ret;
> +
> +done:
> + folio_batch_release(&fbatch);
> + if (wbc->range_cyclic)
> + mapping->writeback_index = folio->index + folio_nr_pages(folio);
> + return error;
> }
> EXPORT_SYMBOL(write_cache_pages);
>
> --
> 2.39.2
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2024-02-05 15:32:34

by Brian Foster

Subject: Re: [PATCH 05/13] writeback: rework the loop termination condition in write_cache_pages

On Sat, Feb 03, 2024 at 08:11:39AM +0100, Christoph Hellwig wrote:
> Rework the way we deal with the cleanup after the writepage call.
>
> First handle the magic AOP_WRITEPAGE_ACTIVATE separately from real error
> returns to get it out of the way of the actual error handling path.
>
> Then split the handling into integrity vs non-integrity branches, and
> return early using a goto for the non-integrity early loop termination
> to remove the need for the done and done_index local variables, and for
> assigning the error to ret when we can just return the error directly.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> mm/page-writeback.c | 84 ++++++++++++++++++---------------------------
> 1 file changed, 33 insertions(+), 51 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index c7c494526bc650..88b2c4c111c01b 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
..
> @@ -2524,14 +2505,15 @@ int write_cache_pages(struct address_space *mapping,
> * of the file if we are called again, which can only happen due to
> * -ENOMEM from the file system.
> */
> - if (wbc->range_cyclic) {
> - if (done)
> - mapping->writeback_index = done_index;
> - else
> - mapping->writeback_index = 0;
> - }
> -
> + if (wbc->range_cyclic)
> + mapping->writeback_index = 0;
> return ret;
> +
> +done:
> + folio_batch_release(&fbatch);
> + if (wbc->range_cyclic)
> + mapping->writeback_index = folio->index + folio_nr_pages(folio);

Shouldn't this release the batch after we're done accessing folio? With
that addressed:

Reviewed-by: Brian Foster <[email protected]>
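
For reference, a rough sketch of the reordering being suggested here,
i.e. only dropping the batch reference once the folio fields have been
read (illustrative only; the actual follow-up fix may differ):

	done:
		/* read folio->index before the batch reference is dropped */
		if (wbc->range_cyclic)
			mapping->writeback_index = folio->index + folio_nr_pages(folio);
		folio_batch_release(&fbatch);
		return error;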

> + return error;
> }
> EXPORT_SYMBOL(write_cache_pages);
>
> --
> 2.39.2
>
>