2022-09-01 22:03:29

by Vishal Moola

Subject: [PATCH 00/23] Convert to filemap_get_folios_tag()

This patch series replaces find_get_pages_range_tag() with
filemap_get_folios_tag(), which also allows the removal of multiple
calls to compound_head() throughout. The series makes a good chunk of
the straightforward conversions to folios, and takes the opportunity to
introduce filemap_grab_folio(), a function that grabs a folio from the
page cache.
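
The conversions all follow the same basic shape; a minimal sketch of the
new loop idiom (illustrative only, not taken verbatim from any one patch):

        struct folio_batch fbatch;
        unsigned int i, nr_folios;

        folio_batch_init(&fbatch);
        while ((nr_folios = filemap_get_folios_tag(mapping, &index, end,
                                tag, &fbatch))) {
                for (i = 0; i < nr_folios; i++) {
                        struct folio *folio = fbatch.folios[i];

                        /* operate on the folio instead of its pages */
                }
                folio_batch_release(&fbatch);
                cond_resched();
        }

filemap_grab_folio() itself (patch 01, not shown in this section) is
expected to mirror filemap_grab_page(); a sketch, assuming it wraps
__filemap_get_folio() with the same FGP flags:

        static inline struct folio *filemap_grab_folio(
                        struct address_space *mapping, pgoff_t index)
        {
                return __filemap_get_folio(mapping, index,
                                FGP_LOCK | FGP_ACCESSED | FGP_CREAT,
                                mapping_gfp_mask(mapping));
        }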

F2fs and Ceph have quite a lot of work to be done regarding folios, so
for now those patches only have the changes necessary for the removal of
find_get_pages_range_tag(), and only support folios of size 1 (which is
all they use right now anyway).

I've run xfstests on btrfs, ext4, f2fs, and nilfs2, but more testing may be
beneficial. The page-writeback and filemap changes are exercised implicitly
by those runs. Testing and review of the other changes (afs, ceph, cifs,
gfs2) would be appreciated.

Vishal Moola (Oracle) (23):
pagemap: Add filemap_grab_folio()
filemap: Add filemap_get_folios_tag()
filemap: Convert __filemap_fdatawait_range() to use
filemap_get_folios_tag()
page-writeback: Convert write_cache_pages() to use
filemap_get_folios_tag()
afs: Convert afs_writepages_region() to use filemap_get_folios_tag()
btrfs: Convert btree_write_cache_pages() to use
filemap_get_folios_tag()
btrfs: Convert extent_write_cache_pages() to use
filemap_get_folios_tag()
ceph: Convert ceph_writepages_start() to use filemap_get_folios_tag()
cifs: Convert wdata_alloc_and_fillpages() to use
filemap_get_folios_tag()
ext4: Convert mpage_prepare_extent_to_map() to use
filemap_get_folios_tag()
f2fs: Convert f2fs_fsync_node_pages() to use filemap_get_folios_tag()
f2fs: Convert f2fs_flush_inline_data() to use filemap_get_folios_tag()
f2fs: Convert f2fs_sync_node_pages() to use filemap_get_folios_tag()
f2fs: Convert f2fs_write_cache_pages() to use filemap_get_folios_tag()
f2fs: Convert last_fsync_dnode() to use filemap_get_folios_tag()
f2fs: Convert f2fs_sync_meta_pages() to use filemap_get_folios_tag()
gfs2: Convert gfs2_write_cache_jdata() to use filemap_get_folios_tag()
nilfs2: Convert nilfs_lookup_dirty_data_buffers() to use
filemap_get_folios_tag()
nilfs2: Convert nilfs_lookup_dirty_node_buffers() to use
filemap_get_folios_tag()
nilfs2: Convert nilfs_btree_lookup_dirty_buffers() to use
filemap_get_folios_tag()
nilfs2: Convert nilfs_copy_dirty_pages() to use
filemap_get_folios_tag()
nilfs2: Convert nilfs_clear_dirty_pages() to use
filemap_get_folios_tag()
filemap: Remove find_get_pages_range_tag()

fs/afs/write.c | 114 +++++++++++++++++----------------
fs/btrfs/extent_io.c | 57 +++++++++--------
fs/ceph/addr.c | 138 ++++++++++++++++++++--------------------
fs/cifs/file.c | 33 +++++++++-
fs/ext4/inode.c | 55 ++++++++--------
fs/f2fs/checkpoint.c | 49 +++++++-------
fs/f2fs/compress.c | 13 ++--
fs/f2fs/data.c | 67 ++++++++++---------
fs/f2fs/f2fs.h | 5 +-
fs/f2fs/node.c | 72 +++++++++++----------
fs/gfs2/aops.c | 64 ++++++++++---------
fs/nilfs2/btree.c | 14 ++--
fs/nilfs2/page.c | 59 ++++++++---------
fs/nilfs2/segment.c | 44 +++++++------
include/linux/pagemap.h | 32 +++++++---
include/linux/pagevec.h | 8 ---
mm/filemap.c | 87 ++++++++++++-------------
mm/page-writeback.c | 44 +++++++------
mm/swap.c | 10 ---
19 files changed, 506 insertions(+), 459 deletions(-)

--
2.36.1


2022-09-01 22:03:37

by Vishal Moola

Subject: [PATCH 03/23] filemap: Convert __filemap_fdatawait_range() to use filemap_get_folios_tag()

Converted the function to use folios. This is in preparation for the removal
of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/filemap.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 3ded72a65668..435fc53b3f2f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -503,28 +503,30 @@ static void __filemap_fdatawait_range(struct address_space *mapping,
{
pgoff_t index = start_byte >> PAGE_SHIFT;
pgoff_t end = end_byte >> PAGE_SHIFT;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ unsigned nr_folios;

if (end_byte < start_byte)
return;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
+
while (index <= end) {
unsigned i;

- nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
- end, PAGECACHE_TAG_WRITEBACK);
- if (!nr_pages)
+ nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ PAGECACHE_TAG_WRITEBACK, &fbatch);
+
+ if (!nr_folios)
break;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

- wait_on_page_writeback(page);
- ClearPageError(page);
+ folio_wait_writeback(folio);
+ folio_clear_error(folio);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
}
--
2.36.1

2022-09-01 22:03:41

by Vishal Moola

Subject: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

Converted the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
mm/page-writeback.c | 44 +++++++++++++++++++++++---------------------
1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 032a7bf8d259..087165357a5a 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2285,15 +2285,15 @@ int write_cache_pages(struct address_space *mapping,
int ret = 0;
int done = 0;
int error;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ int nr_folios;
pgoff_t index;
pgoff_t end; /* Inclusive */
pgoff_t done_index;
int range_whole = 0;
xa_mark_t tag;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
if (wbc->range_cyclic) {
index = mapping->writeback_index; /* prev offset */
end = -1;
@@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
while (!done && (index <= end)) {
int i;

- nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
- tag);
- if (nr_pages == 0)
+ nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ tag, &fbatch);
+
+ if (nr_folios == 0)
break;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

- done_index = page->index;
+ done_index = folio->index;

- lock_page(page);
+ folio_lock(folio);

/*
* Page truncated or invalidated. We can freely skip it
@@ -2333,30 +2334,30 @@ int write_cache_pages(struct address_space *mapping,
* even if there is now a new, dirty page at the same
* pagecache address.
*/
- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(folio->mapping != mapping)) {
continue_unlock:
- unlock_page(page);
+ folio_unlock(folio);
continue;
}

- if (!PageDirty(page)) {
+ if (!folio_test_dirty(folio)) {
/* someone wrote it for us */
goto continue_unlock;
}

- if (PageWriteback(page)) {
+ if (folio_test_writeback(folio)) {
if (wbc->sync_mode != WB_SYNC_NONE)
- wait_on_page_writeback(page);
+ folio_wait_writeback(folio);
else
goto continue_unlock;
}

- BUG_ON(PageWriteback(page));
- if (!clear_page_dirty_for_io(page))
+ BUG_ON(folio_test_writeback(folio));
+ if (!folio_clear_dirty_for_io(folio))
goto continue_unlock;

trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
- error = (*writepage)(page, wbc, data);
+ error = writepage(&folio->page, wbc, data);
if (unlikely(error)) {
/*
* Handle errors according to the type of
@@ -2371,11 +2372,12 @@ int write_cache_pages(struct address_space *mapping,
* the first error.
*/
if (error == AOP_WRITEPAGE_ACTIVATE) {
- unlock_page(page);
+ folio_unlock(folio);
error = 0;
} else if (wbc->sync_mode != WB_SYNC_ALL) {
ret = error;
- done_index = page->index + 1;
+ done_index = folio->index +
+ folio_nr_pages(folio);
done = 1;
break;
}
@@ -2395,7 +2397,7 @@ int write_cache_pages(struct address_space *mapping,
break;
}
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}

--
2.36.1

2022-09-01 22:03:45

by Vishal Moola

Subject: [PATCH 05/23] afs: Convert afs_writepages_region() to use filemap_get_folios_tag()

Convert the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Also modified the function to write back an entire batch at a time,
rather than looking up a single page for every write.
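
In outline, the restructured loop now has this shape (a sketch of the
shape only; the per-folio handling is exactly what the diff below shows):

        folio_batch_init(&fbatch);
        do {
                pgoff_t index = start / PAGE_SIZE;

                n = filemap_get_folios_tag(mapping, &index, end / PAGE_SIZE,
                                PAGECACHE_TAG_DIRTY, &fbatch);
                if (!n)
                        break;
                for (i = 0; i < n; i++) {
                        /* lock, validate, and write back fbatch.folios[i] */
                }
                folio_batch_release(&fbatch);
                cond_resched();
        } while (wbc->nr_to_write > 0);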

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/afs/write.c | 114 +++++++++++++++++++++++++------------------------
1 file changed, 59 insertions(+), 55 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 9ebdd36eaf2f..c17dbd82a38c 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -699,82 +699,86 @@ static int afs_writepages_region(struct address_space *mapping,
loff_t start, loff_t end, loff_t *_next)
{
struct folio *folio;
- struct page *head_page;
+ struct folio_batch fbatch;
ssize_t ret;
+ unsigned int i;
int n, skips = 0;

_enter("%llx,%llx,", start, end);
+ folio_batch_init(&fbatch);

do {
pgoff_t index = start / PAGE_SIZE;

- n = find_get_pages_range_tag(mapping, &index, end / PAGE_SIZE,
- PAGECACHE_TAG_DIRTY, 1, &head_page);
+ n = filemap_get_folios_tag(mapping, &index, end / PAGE_SIZE,
+ PAGECACHE_TAG_DIRTY, &fbatch);
+
if (!n)
break;
+ for (i = 0; i < n; i++) {
+ folio = fbatch.folios[i];
+ start = folio_pos(folio); /* May regress with THPs */

- folio = page_folio(head_page);
- start = folio_pos(folio); /* May regress with THPs */
-
- _debug("wback %lx", folio_index(folio));
+ _debug("wback %lx", folio_index(folio));

- /* At this point we hold neither the i_pages lock nor the
- * page lock: the page may be truncated or invalidated
- * (changing page->mapping to NULL), or even swizzled
- * back from swapper_space to tmpfs file mapping
- */
- if (wbc->sync_mode != WB_SYNC_NONE) {
- ret = folio_lock_killable(folio);
- if (ret < 0) {
- folio_put(folio);
- return ret;
- }
- } else {
- if (!folio_trylock(folio)) {
- folio_put(folio);
- return 0;
+ /* At this point we hold neither the i_pages lock nor the
+ * page lock: the page may be truncated or invalidated
+ * (changing page->mapping to NULL), or even swizzled
+ * back from swapper_space to tmpfs file mapping
+ */
+ if (wbc->sync_mode != WB_SYNC_NONE) {
+ ret = folio_lock_killable(folio);
+ if (ret < 0) {
+ folio_batch_release(&fbatch);
+ return ret;
+ }
+ } else {
+ if (!folio_trylock(folio))
+ continue;
}
- }

- if (folio_mapping(folio) != mapping ||
- !folio_test_dirty(folio)) {
- start += folio_size(folio);
- folio_unlock(folio);
- folio_put(folio);
- continue;
- }
+ if (folio->mapping != mapping ||
+ !folio_test_dirty(folio)) {
+ start += folio_size(folio);
+ folio_unlock(folio);
+ continue;
+ }

- if (folio_test_writeback(folio) ||
- folio_test_fscache(folio)) {
- folio_unlock(folio);
- if (wbc->sync_mode != WB_SYNC_NONE) {
- folio_wait_writeback(folio);
+ if (folio_test_writeback(folio) ||
+ folio_test_fscache(folio)) {
+ folio_unlock(folio);
+ if (wbc->sync_mode != WB_SYNC_NONE) {
+ folio_wait_writeback(folio);
#ifdef CONFIG_AFS_FSCACHE
- folio_wait_fscache(folio);
+ folio_wait_fscache(folio);
#endif
- } else {
- start += folio_size(folio);
+ } else {
+ start += folio_size(folio);
+ }
+ if (wbc->sync_mode == WB_SYNC_NONE) {
+ if (skips >= 5 || need_resched()) {
+ *_next = start;
+ _leave(" = 0 [%llx]", *_next);
+ return 0;
+ }
+ skips++;
+ }
+ continue;
}
- folio_put(folio);
- if (wbc->sync_mode == WB_SYNC_NONE) {
- if (skips >= 5 || need_resched())
- break;
- skips++;
+
+ if (!folio_clear_dirty_for_io(folio))
+ BUG();
+ ret = afs_write_back_from_locked_folio(mapping, wbc,
+ folio, start, end);
+ if (ret < 0) {
+ _leave(" = %zd", ret);
+ folio_batch_release(&fbatch);
+ return ret;
}
- continue;
- }

- if (!folio_clear_dirty_for_io(folio))
- BUG();
- ret = afs_write_back_from_locked_folio(mapping, wbc, folio, start, end);
- folio_put(folio);
- if (ret < 0) {
- _leave(" = %zd", ret);
- return ret;
+ start += ret;
}
-
- start += ret;
-
+ folio_batch_release(&fbatch);
cond_resched();
} while (wbc->nr_to_write > 0);

--
2.36.1

2022-09-01 22:04:24

by Vishal Moola

Subject: [PATCH 06/23] btrfs: Convert btree_write_cache_pages() to use filemap_get_folios_tag()

Converted the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/btrfs/extent_io.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index cf4f19e80e2f..d1fa072bfdd0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4844,14 +4844,14 @@ int btree_write_cache_pages(struct address_space *mapping,
int ret = 0;
int done = 0;
int nr_to_write_done = 0;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ unsigned int nr_folios;
pgoff_t index;
pgoff_t end; /* Inclusive */
int scanned = 0;
xa_mark_t tag;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
if (wbc->range_cyclic) {
index = mapping->writeback_index; /* Start from prev offset */
end = -1;
@@ -4874,14 +4874,15 @@ int btree_write_cache_pages(struct address_space *mapping,
if (wbc->sync_mode == WB_SYNC_ALL)
tag_pages_for_writeback(mapping, index, end);
while (!done && !nr_to_write_done && (index <= end) &&
- (nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
- tag))) {
+ (nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ tag, &fbatch))) {
unsigned i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

- ret = submit_eb_page(page, wbc, &epd, &eb_context);
+ ret = submit_eb_page(&folio->page, wbc, &epd,
+ &eb_context);
if (ret == 0)
continue;
if (ret < 0) {
@@ -4896,7 +4897,7 @@ int btree_write_cache_pages(struct address_space *mapping,
*/
nr_to_write_done = wbc->nr_to_write <= 0;
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
if (!scanned && !done) {
--
2.36.1

2022-09-01 22:04:40

by Vishal Moola

Subject: [PATCH 08/23] ceph: Convert ceph_writepages_start() to use filemap_get_folios_tag()

Convert the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

This change does NOT support large folios. That shouldn't be an issue
for now, since ceph only utilizes folios of size 1 anyway, and a lot of
the ceph folio-conversion work is left for later patches.

Also some minor renaming for consistency.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/ceph/addr.c | 138 +++++++++++++++++++++++++------------------------
1 file changed, 70 insertions(+), 68 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index dcf701b05cc1..33dbe55b08be 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -792,7 +792,7 @@ static int ceph_writepages_start(struct address_space *mapping,
struct ceph_vino vino = ceph_vino(inode);
pgoff_t index, start_index, end = -1;
struct ceph_snap_context *snapc = NULL, *last_snapc = NULL, *pgsnapc;
- struct pagevec pvec;
+ struct folio_batch fbatch;
int rc = 0;
unsigned int wsize = i_blocksize(inode);
struct ceph_osd_request *req = NULL;
@@ -821,7 +821,7 @@ static int ceph_writepages_start(struct address_space *mapping,
if (fsc->mount_options->wsize < wsize)
wsize = fsc->mount_options->wsize;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

start_index = wbc->range_cyclic ? mapping->writeback_index : 0;
index = start_index;
@@ -869,9 +869,9 @@ static int ceph_writepages_start(struct address_space *mapping,

while (!done && index <= end) {
int num_ops = 0, op_idx;
- unsigned i, pvec_pages, max_pages, locked_pages = 0;
+ unsigned i, nr_folios, max_pages, locked_pages = 0;
struct page **pages = NULL, **data_pages;
- struct page *page;
+ struct folio *folio;
pgoff_t strip_unit_end = 0;
u64 offset = 0, len = 0;
bool from_pool = false;
@@ -879,28 +879,28 @@ static int ceph_writepages_start(struct address_space *mapping,
max_pages = wsize >> PAGE_SHIFT;

get_more_pages:
- pvec_pages = pagevec_lookup_range_tag(&pvec, mapping, &index,
- end, PAGECACHE_TAG_DIRTY);
- dout("pagevec_lookup_range_tag got %d\n", pvec_pages);
- if (!pvec_pages && !locked_pages)
+ nr_folios = filemap_get_folios_tag(mapping, &index,
+ end, PAGECACHE_TAG_DIRTY, &fbatch);
+ dout("filemap_get_folios_tag got %d\n", nr_folios);
+ if (!nr_folios && !locked_pages)
break;
- for (i = 0; i < pvec_pages && locked_pages < max_pages; i++) {
- page = pvec.pages[i];
- dout("? %p idx %lu\n", page, page->index);
+ for (i = 0; i < nr_folios && locked_pages < max_pages; i++) {
+ folio = fbatch.folios[i];
+ dout("? %p idx %lu\n", folio, folio->index);
if (locked_pages == 0)
- lock_page(page); /* first page */
- else if (!trylock_page(page))
+ folio_lock(folio); /* first folio */
+ else if (!folio_trylock(folio))
break;

/* only dirty pages, or our accounting breaks */
- if (unlikely(!PageDirty(page)) ||
- unlikely(page->mapping != mapping)) {
- dout("!dirty or !mapping %p\n", page);
- unlock_page(page);
+ if (unlikely(!folio_test_dirty(folio)) ||
+ unlikely(folio->mapping != mapping)) {
+ dout("!dirty or !mapping %p\n", folio);
+ folio_unlock(folio);
continue;
}
/* only if matching snap context */
- pgsnapc = page_snap_context(page);
+ pgsnapc = page_snap_context(&folio->page);
if (pgsnapc != snapc) {
dout("page snapc %p %lld != oldest %p %lld\n",
pgsnapc, pgsnapc->seq, snapc, snapc->seq);
@@ -908,11 +908,10 @@ static int ceph_writepages_start(struct address_space *mapping,
!ceph_wbc.head_snapc &&
wbc->sync_mode != WB_SYNC_NONE)
should_loop = true;
- unlock_page(page);
+ folio_unlock(folio);
continue;
}
- if (page_offset(page) >= ceph_wbc.i_size) {
- struct folio *folio = page_folio(page);
+ if (folio_pos(folio) >= ceph_wbc.i_size) {

dout("folio at %lu beyond eof %llu\n",
folio->index, ceph_wbc.i_size);
@@ -924,25 +923,26 @@ static int ceph_writepages_start(struct address_space *mapping,
folio_unlock(folio);
continue;
}
- if (strip_unit_end && (page->index > strip_unit_end)) {
- dout("end of strip unit %p\n", page);
- unlock_page(page);
+ if (strip_unit_end && (folio->index > strip_unit_end)) {
+ dout("end of strip unit %p\n", folio);
+ folio_unlock(folio);
break;
}
- if (PageWriteback(page) || PageFsCache(page)) {
+ if (folio_test_writeback(folio) ||
+ folio_test_fscache(folio)) {
if (wbc->sync_mode == WB_SYNC_NONE) {
- dout("%p under writeback\n", page);
- unlock_page(page);
+ dout("%p under writeback\n", folio);
+ folio_unlock(folio);
continue;
}
- dout("waiting on writeback %p\n", page);
- wait_on_page_writeback(page);
- wait_on_page_fscache(page);
+ dout("waiting on writeback %p\n", folio);
+ folio_wait_writeback(folio);
+ folio_wait_fscache(folio);
}

- if (!clear_page_dirty_for_io(page)) {
- dout("%p !clear_page_dirty_for_io\n", page);
- unlock_page(page);
+ if (!folio_clear_dirty_for_io(folio)) {
+ dout("%p !clear_page_dirty_for_io\n", folio);
+ folio_unlock(folio);
continue;
}

@@ -958,7 +958,7 @@ static int ceph_writepages_start(struct address_space *mapping,
u32 xlen;

/* prepare async write request */
- offset = (u64)page_offset(page);
+ offset = (u64)folio_pos(folio);
ceph_calc_file_object_mapping(&ci->i_layout,
offset, wsize,
&objnum, &objoff,
@@ -966,7 +966,7 @@ static int ceph_writepages_start(struct address_space *mapping,
len = xlen;

num_ops = 1;
- strip_unit_end = page->index +
+ strip_unit_end = folio->index +
((len - 1) >> PAGE_SHIFT);

BUG_ON(pages);
@@ -981,54 +981,53 @@ static int ceph_writepages_start(struct address_space *mapping,
}

len = 0;
- } else if (page->index !=
+ } else if (folio->index !=
(offset + len) >> PAGE_SHIFT) {
if (num_ops >= (from_pool ? CEPH_OSD_SLAB_OPS :
CEPH_OSD_MAX_OPS)) {
- redirty_page_for_writepage(wbc, page);
- unlock_page(page);
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
break;
}

num_ops++;
- offset = (u64)page_offset(page);
+ offset = (u64)folio_pos(folio);
len = 0;
}

- /* note position of first page in pvec */
+ /* note position of first page in fbatch */
dout("%p will write page %p idx %lu\n",
- inode, page, page->index);
+ inode, folio, folio->index);

if (atomic_long_inc_return(&fsc->writeback_count) >
CONGESTION_ON_THRESH(
fsc->mount_options->congestion_kb))
fsc->write_congested = true;

- pages[locked_pages++] = page;
- pvec.pages[i] = NULL;
+ pages[locked_pages++] = &folio->page;
+ fbatch.folios[i] = NULL;

- len += thp_size(page);
+ len += folio_size(folio);
}

/* did we get anything? */
if (!locked_pages)
- goto release_pvec_pages;
+ goto release_folio_batches;
if (i) {
unsigned j, n = 0;
- /* shift unused page to beginning of pvec */
- for (j = 0; j < pvec_pages; j++) {
- if (!pvec.pages[j])
+ /* shift unused folio to the beginning of fbatch */
+ for (j = 0; j < nr_folios; j++) {
+ if (!fbatch.folios[j])
continue;
if (n < j)
- pvec.pages[n] = pvec.pages[j];
+ fbatch.folios[n] = fbatch.folios[j];
n++;
}
- pvec.nr = n;
-
- if (pvec_pages && i == pvec_pages &&
+ fbatch.nr = n;
+ if (nr_folios && i == nr_folios &&
locked_pages < max_pages) {
- dout("reached end pvec, trying for more\n");
- pagevec_release(&pvec);
+ dout("reached end of fbatch, trying for more\n");
+ folio_batch_release(&fbatch);
goto get_more_pages;
}
}
@@ -1056,7 +1055,7 @@ static int ceph_writepages_start(struct address_space *mapping,
BUG_ON(IS_ERR(req));
}
BUG_ON(len < page_offset(pages[locked_pages - 1]) +
- thp_size(page) - offset);
+ folio_size(folio) - offset);

req->r_callback = writepages_finish;
req->r_inode = inode;
@@ -1098,7 +1097,7 @@ static int ceph_writepages_start(struct address_space *mapping,
set_page_writeback(pages[i]);
if (caching)
ceph_set_page_fscache(pages[i]);
- len += thp_size(page);
+ len += folio_size(folio);
}
ceph_fscache_write_to_cache(inode, offset, len, caching);

@@ -1108,7 +1107,7 @@ static int ceph_writepages_start(struct address_space *mapping,
/* writepages_finish() clears writeback pages
* according to the data length, so make sure
* data length covers all locked pages */
- u64 min_len = len + 1 - thp_size(page);
+ u64 min_len = len + 1 - folio_size(folio);
len = get_writepages_data_length(inode, pages[i - 1],
offset);
len = max(len, min_len);
@@ -1164,10 +1163,10 @@ static int ceph_writepages_start(struct address_space *mapping,
if (wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE)
done = true;

-release_pvec_pages:
- dout("pagevec_release on %d pages (%p)\n", (int)pvec.nr,
- pvec.nr ? pvec.pages[0] : NULL);
- pagevec_release(&pvec);
+release_folio_batches:
+ dout("folio_batch_release on %d batches (%p)", (int) fbatch.nr,
+ fbatch.nr ? fbatch.folios[0] : NULL);
+ folio_batch_release(&fbatch);
}

if (should_loop && !done) {
@@ -1180,19 +1179,22 @@ static int ceph_writepages_start(struct address_space *mapping,
if (wbc->sync_mode != WB_SYNC_NONE &&
start_index == 0 && /* all dirty pages were checked */
!ceph_wbc.head_snapc) {
- struct page *page;
+ struct folio *folio;
unsigned i, nr;
index = 0;
while ((index <= end) &&
- (nr = pagevec_lookup_tag(&pvec, mapping, &index,
- PAGECACHE_TAG_WRITEBACK))) {
+ (nr = filemap_get_folios_tag(mapping, &index,
+ (pgoff_t)-1,
+ PAGECACHE_TAG_WRITEBACK,
+ &fbatch))) {
for (i = 0; i < nr; i++) {
- page = pvec.pages[i];
- if (page_snap_context(page) != snapc)
+ folio = fbatch.folios[i];
+ if (page_snap_context(&folio->page) !=
+ snapc)
continue;
- wait_on_page_writeback(page);
+ folio_wait_writeback(folio);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
}
--
2.36.1

2022-09-01 22:05:00

by Vishal Moola

Subject: [PATCH 09/23] cifs: Convert wdata_alloc_and_fillpages() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal
of find_get_pages_range_tag(). Now also supports the use of large
folios.

Since tofind might be larger than the maximum number of folios in a
folio_batch (15), we loop, filling in wdata->pages and pulling more
batches until we either reach tofind pages or run out of folios.

The function may stop partway through the last folio it finds if tofind
pages are reached first; the remaining pages of that folio are not
returned.
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/cifs/file.c | 33 ++++++++++++++++++++++++++++++---
1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index fa738adc031f..c4da53b57369 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2517,14 +2517,41 @@ wdata_alloc_and_fillpages(pgoff_t tofind, struct address_space *mapping,
unsigned int *found_pages)
{
struct cifs_writedata *wdata;
-
+ struct folio_batch fbatch;
+ unsigned int i, idx, p, nr;
wdata = cifs_writedata_alloc((unsigned int)tofind,
cifs_writev_complete);
if (!wdata)
return NULL;

- *found_pages = find_get_pages_range_tag(mapping, index, end,
- PAGECACHE_TAG_DIRTY, tofind, wdata->pages);
+ folio_batch_init(&fbatch);
+ *found_pages = 0;
+
+again:
+ nr = filemap_get_folios_tag(mapping, index, end,
+ PAGECACHE_TAG_DIRTY, &fbatch);
+ if (!nr)
+ goto out; /* No dirty pages left in the range */
+
+ for (i = 0; i < nr; i++) {
+ struct folio *folio = fbatch.folios[i];
+
+ idx = 0;
+ p = folio_nr_pages(folio);
+add_more:
+ wdata->pages[*found_pages] = folio_page(folio, idx);
+ if (++*found_pages == tofind) {
+ folio_batch_release(&fbatch);
+ goto out;
+ }
+ if (++idx < p) {
+ folio_ref_inc(folio);
+ goto add_more;
+ }
+ }
+ folio_batch_release(&fbatch);
+ goto again;
+out:
return wdata;
}

--
2.36.1

2022-09-01 22:05:12

by Vishal Moola

Subject: [PATCH 10/23] ext4: Convert mpage_prepare_extent_to_map() to use filemap_get_folios_tag()

Converted the function to use folios throughout. This is in preparation
for the removal of find_get_pages_range_tag(). Now supports large
folios.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/ext4/inode.c | 55 ++++++++++++++++++++++++-------------------------
1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 601214453c3a..fbd876e10a85 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2565,8 +2565,8 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
{
struct address_space *mapping = mpd->inode->i_mapping;
- struct pagevec pvec;
- unsigned int nr_pages;
+ struct folio_batch fbatch;
+ unsigned int nr_folios;
long left = mpd->wbc->nr_to_write;
pgoff_t index = mpd->first_page;
pgoff_t end = mpd->last_page;
@@ -2580,18 +2580,17 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
tag = PAGECACHE_TAG_TOWRITE;
else
tag = PAGECACHE_TAG_DIRTY;
-
- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
mpd->map.m_len = 0;
mpd->next_page = index;
while (index <= end) {
- nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
- tag);
- if (nr_pages == 0)
+ nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ tag, &fbatch);
+ if (nr_folios == 0)
break;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

/*
* Accumulated enough dirty pages? This doesn't apply
@@ -2605,10 +2604,10 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
goto out;

/* If we can't merge this page, we are done. */
- if (mpd->map.m_len > 0 && mpd->next_page != page->index)
+ if (mpd->map.m_len > 0 && mpd->next_page != folio->index)
goto out;

- lock_page(page);
+ folio_lock(folio);
/*
* If the page is no longer dirty, or its mapping no
* longer corresponds to inode we are writing (which
@@ -2616,16 +2615,16 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
* page is already under writeback and we are not doing
* a data integrity writeback, skip the page
*/
- if (!PageDirty(page) ||
- (PageWriteback(page) &&
+ if (!folio_test_dirty(folio) ||
+ (folio_test_writeback(folio) &&
(mpd->wbc->sync_mode == WB_SYNC_NONE)) ||
- unlikely(page->mapping != mapping)) {
- unlock_page(page);
+ unlikely(folio->mapping != mapping)) {
+ folio_unlock(folio);
continue;
}

- wait_on_page_writeback(page);
- BUG_ON(PageWriteback(page));
+ folio_wait_writeback(folio);
+ BUG_ON(folio_test_writeback(folio));

/*
* Should never happen but for buggy code in
@@ -2636,33 +2635,33 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
*
* [1] https://lore.kernel.org/linux-mm/[email protected]
*/
- if (!page_has_buffers(page)) {
- ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index);
- ClearPageDirty(page);
- unlock_page(page);
+ if (!folio_buffers(folio)) {
+ ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", folio->index);
+ folio_clear_dirty(folio);
+ folio_unlock(folio);
continue;
}

if (mpd->map.m_len == 0)
- mpd->first_page = page->index;
- mpd->next_page = page->index + 1;
+ mpd->first_page = folio->index;
+ mpd->next_page = folio->index + folio_nr_pages(folio);
/* Add all dirty buffers to mpd */
- lblk = ((ext4_lblk_t)page->index) <<
+ lblk = ((ext4_lblk_t)folio->index) <<
(PAGE_SHIFT - blkbits);
- head = page_buffers(page);
+ head = folio_buffers(folio);
err = mpage_process_page_bufs(mpd, head, head, lblk);
if (err <= 0)
goto out;
err = 0;
- left--;
+ left -= folio_nr_pages(folio);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
mpd->scanned_until_end = 1;
return 0;
out:
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
return err;
}

--
2.36.1

2022-09-01 22:05:35

by Vishal Moola

Subject: [PATCH 12/23] f2fs: Convert f2fs_flush_inline_data() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal
of find_get_pages_tag(). Does NOT support large folios.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/node.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index a3c5eedfcf64..c2b54c58392a 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1885,17 +1885,18 @@ static bool flush_dirty_inode(struct page *page)
void f2fs_flush_inline_data(struct f2fs_sb_info *sbi)
{
pgoff_t index = 0;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ int nr_folios;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

- while ((nr_pages = pagevec_lookup_tag(&pvec,
- NODE_MAPPING(sbi), &index, PAGECACHE_TAG_DIRTY))) {
+ while ((nr_folios = filemap_get_folios_tag(NODE_MAPPING(sbi), &index,
+ (pgoff_t)-1, PAGECACHE_TAG_DIRTY,
+ &fbatch))) {
int i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct page *page = &fbatch.folios[i]->page;

if (!IS_DNODE(page))
continue;
@@ -1922,7 +1923,7 @@ void f2fs_flush_inline_data(struct f2fs_sb_info *sbi)
}
unlock_page(page);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
}
--
2.36.1

2022-09-01 22:05:41

by Vishal Moola

Subject: [PATCH 07/23] btrfs: Convert extent_write_cache_pages() to use filemap_get_folios_tag()

Converted the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag(). Now also supports large
folios.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/btrfs/extent_io.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d1fa072bfdd0..80fe313f8461 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4972,8 +4972,8 @@ static int extent_write_cache_pages(struct address_space *mapping,
int ret = 0;
int done = 0;
int nr_to_write_done = 0;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ unsigned int nr_folios;
pgoff_t index;
pgoff_t end; /* Inclusive */
pgoff_t done_index;
@@ -4993,7 +4993,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
if (!igrab(inode))
return 0;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
if (wbc->range_cyclic) {
index = mapping->writeback_index; /* Start from prev offset */
end = -1;
@@ -5031,14 +5031,14 @@ static int extent_write_cache_pages(struct address_space *mapping,
tag_pages_for_writeback(mapping, index, end);
done_index = index;
while (!done && !nr_to_write_done && (index <= end) &&
- (nr_pages = pagevec_lookup_range_tag(&pvec, mapping,
- &index, end, tag))) {
+ (nr_folios = filemap_get_folios_tag(mapping, &index,
+ end, tag, &fbatch))) {
unsigned i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

- done_index = page->index + 1;
+ done_index = folio->index + folio_nr_pages(folio);
/*
* At this point we hold neither the i_pages lock nor
* the page lock: the page may be truncated or
@@ -5046,29 +5046,29 @@ static int extent_write_cache_pages(struct address_space *mapping,
* or even swizzled back from swapper_space to
* tmpfs file mapping
*/
- if (!trylock_page(page)) {
+ if (!folio_trylock(folio)) {
submit_write_bio(epd, 0);
- lock_page(page);
+ folio_lock(folio);
}

- if (unlikely(page->mapping != mapping)) {
- unlock_page(page);
+ if (unlikely(folio->mapping != mapping)) {
+ folio_unlock(folio);
continue;
}

if (wbc->sync_mode != WB_SYNC_NONE) {
- if (PageWriteback(page))
+ if (folio_test_writeback(folio))
submit_write_bio(epd, 0);
- wait_on_page_writeback(page);
+ folio_wait_writeback(folio);
}

- if (PageWriteback(page) ||
- !clear_page_dirty_for_io(page)) {
- unlock_page(page);
+ if (folio_test_writeback(folio) ||
+ !folio_clear_dirty_for_io(folio)) {
+ folio_unlock(folio);
continue;
}

- ret = __extent_writepage(page, wbc, epd);
+ ret = __extent_writepage(&folio->page, wbc, epd);
if (ret < 0) {
done = 1;
break;
@@ -5081,7 +5081,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
*/
nr_to_write_done = wbc->nr_to_write <= 0;
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
if (!scanned && !done) {
--
2.36.1

2022-09-01 22:05:46

by Vishal Moola

Subject: [PATCH 11/23] f2fs: Convert f2fs_fsync_node_pages() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal
of find_get_pages_range_tag(). Does NOT support large folios.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/node.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index e06a0c478b39..a3c5eedfcf64 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1726,12 +1726,12 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
unsigned int *seq_id)
{
pgoff_t index;
- struct pagevec pvec;
+ struct folio_batch fbatch;
int ret = 0;
struct page *last_page = NULL;
bool marked = false;
nid_t ino = inode->i_ino;
- int nr_pages;
+ int nr_folios;
int nwritten = 0;

if (atomic) {
@@ -1740,20 +1740,21 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
return PTR_ERR_OR_ZERO(last_page);
}
retry:
- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
index = 0;

- while ((nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index,
- PAGECACHE_TAG_DIRTY))) {
+ while ((nr_folios = filemap_get_folios_tag(NODE_MAPPING(sbi), &index,
+ (pgoff_t)-1, PAGECACHE_TAG_DIRTY,
+ &fbatch))) {
int i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct page *page = &fbatch.folios[i]->page;
bool submitted = false;

if (unlikely(f2fs_cp_error(sbi))) {
f2fs_put_page(last_page, 0);
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
ret = -EIO;
goto out;
}
@@ -1819,7 +1820,7 @@ int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
break;
}
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();

if (ret || marked)
--
2.36.1

2022-09-01 22:05:53

by Vishal Moola

Subject: [PATCH 14/23] f2fs: Convert f2fs_write_cache_pages() to use filemap_get_folios_tag()

Converted the function to use folios. This is in preparation for the
removal of find_get_pages_range_tag().

Also modified f2fs_all_cluster_page_ready() to take in a folio_batch
instead of a pagevec. This does NOT support large folios. The function
currently only utilizes folios of size 1, so this shouldn't cause any
issues right now.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/compress.c | 13 ++++-----
fs/f2fs/data.c | 67 +++++++++++++++++++++++++---------------------
fs/f2fs/f2fs.h | 5 ++--
3 files changed, 46 insertions(+), 39 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 70e97075e535..e1bd2e859f64 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -841,10 +841,11 @@ bool f2fs_cluster_can_merge_page(struct compress_ctx *cc, pgoff_t index)
return is_page_in_cluster(cc, index);
}

-bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages,
- int index, int nr_pages, bool uptodate)
+bool f2fs_all_cluster_page_ready(struct compress_ctx *cc,
+ struct folio_batch *fbatch,
+ int index, int nr_folios, bool uptodate)
{
- unsigned long pgidx = pages[index]->index;
+ unsigned long pgidx = fbatch->folios[index]->index;
int i = uptodate ? 0 : 1;

/*
@@ -854,13 +855,13 @@ bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages,
if (uptodate && (pgidx % cc->cluster_size))
return false;

- if (nr_pages - index < cc->cluster_size)
+ if (nr_folios - index < cc->cluster_size)
return false;

for (; i < cc->cluster_size; i++) {
- if (pages[index + i]->index != pgidx + i)
+ if (fbatch->folios[index + i]->index != pgidx + i)
return false;
- if (uptodate && !PageUptodate(pages[index + i]))
+ if (uptodate && !folio_test_uptodate(fbatch->folios[index + i]))
return false;
}

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index aa3ccddfa037..f87b9644b10b 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2917,7 +2917,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
{
int ret = 0;
int done = 0, retry = 0;
- struct page *pages[F2FS_ONSTACK_PAGES];
+ struct folio_batch fbatch;
struct f2fs_sb_info *sbi = F2FS_M_SB(mapping);
struct bio *bio = NULL;
sector_t last_block;
@@ -2938,7 +2938,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
.private = NULL,
};
#endif
- int nr_pages;
+ int nr_folios;
pgoff_t index;
pgoff_t end; /* Inclusive */
pgoff_t done_index;
@@ -2948,6 +2948,8 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
int submitted = 0;
int i;

+ folio_batch_init(&fbatch);
+
if (get_dirty_pages(mapping->host) <=
SM_I(F2FS_M_SB(mapping))->min_hot_blocks)
set_inode_flag(mapping->host, FI_HOT_DATA);
@@ -2973,13 +2975,13 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
tag_pages_for_writeback(mapping, index, end);
done_index = index;
while (!done && !retry && (index <= end)) {
- nr_pages = find_get_pages_range_tag(mapping, &index, end,
- tag, F2FS_ONSTACK_PAGES, pages);
- if (nr_pages == 0)
+ nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ tag, &fbatch);
+ if (nr_folios == 0)
break;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];
bool need_readd;
readd:
need_readd = false;
@@ -2996,7 +2998,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
}

if (!f2fs_cluster_can_merge_page(&cc,
- page->index)) {
+ folio->index)) {
ret = f2fs_write_multi_pages(&cc,
&submitted, wbc, io_type);
if (!ret)
@@ -3005,27 +3007,28 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
}

if (unlikely(f2fs_cp_error(sbi)))
- goto lock_page;
+ goto lock_folio;

if (!f2fs_cluster_is_empty(&cc))
- goto lock_page;
+ goto lock_folio;

if (f2fs_all_cluster_page_ready(&cc,
- pages, i, nr_pages, true))
- goto lock_page;
+ &fbatch, i, nr_folios, true))
+ goto lock_folio;

ret2 = f2fs_prepare_compress_overwrite(
inode, &pagep,
- page->index, &fsdata);
+ folio->index, &fsdata);
if (ret2 < 0) {
ret = ret2;
done = 1;
break;
} else if (ret2 &&
(!f2fs_compress_write_end(inode,
- fsdata, page->index, 1) ||
+ fsdata, folio->index, 1) ||
!f2fs_all_cluster_page_ready(&cc,
- pages, i, nr_pages, false))) {
+ &fbatch, i, nr_folios,
+ false))) {
retry = 1;
break;
}
@@ -3038,46 +3041,47 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
break;
}
#ifdef CONFIG_F2FS_FS_COMPRESSION
-lock_page:
+lock_folio:
#endif
- done_index = page->index;
+ done_index = folio->index;
retry_write:
- lock_page(page);
+ folio_lock(folio);

- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(folio->mapping != mapping)) {
continue_unlock:
- unlock_page(page);
+ folio_unlock(folio);
continue;
}

- if (!PageDirty(page)) {
+ if (!folio_test_dirty(folio)) {
/* someone wrote it for us */
goto continue_unlock;
}

- if (PageWriteback(page)) {
+ if (folio_test_writeback(folio)) {
if (wbc->sync_mode != WB_SYNC_NONE)
- f2fs_wait_on_page_writeback(page,
+ f2fs_wait_on_page_writeback(
+ &folio->page,
DATA, true, true);
else
goto continue_unlock;
}

- if (!clear_page_dirty_for_io(page))
+ if (!folio_clear_dirty_for_io(folio))
goto continue_unlock;

#ifdef CONFIG_F2FS_FS_COMPRESSION
if (f2fs_compressed_file(inode)) {
- get_page(page);
- f2fs_compress_ctx_add_page(&cc, page);
+ folio_get(folio);
+ f2fs_compress_ctx_add_page(&cc, &folio->page);
continue;
}
#endif
- ret = f2fs_write_single_data_page(page, &submitted,
- &bio, &last_block, wbc, io_type,
- 0, true);
+ ret = f2fs_write_single_data_page(&folio->page,
+ &submitted, &bio, &last_block,
+ wbc, io_type, 0, true);
if (ret == AOP_WRITEPAGE_ACTIVATE)
- unlock_page(page);
+ folio_unlock(folio);
#ifdef CONFIG_F2FS_FS_COMPRESSION
result:
#endif
@@ -3101,7 +3105,8 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
}
goto next;
}
- done_index = page->index + 1;
+ done_index = folio->index +
+ folio_nr_pages(folio);
done = 1;
break;
}
@@ -3115,7 +3120,7 @@ static int f2fs_write_cache_pages(struct address_space *mapping,
if (need_readd)
goto readd;
}
- release_pages(pages, nr_pages);
+ folio_batch_release(&fbatch);
cond_resched();
}
#ifdef CONFIG_F2FS_FS_COMPRESSION
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 3c7cdb70fe2e..dcb28240f724 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4196,8 +4196,9 @@ void f2fs_end_read_compressed_page(struct page *page, bool failed,
block_t blkaddr, bool in_task);
bool f2fs_cluster_is_empty(struct compress_ctx *cc);
bool f2fs_cluster_can_merge_page(struct compress_ctx *cc, pgoff_t index);
-bool f2fs_all_cluster_page_ready(struct compress_ctx *cc, struct page **pages,
- int index, int nr_pages, bool uptodate);
+bool f2fs_all_cluster_page_ready(struct compress_ctx *cc,
+ struct folio_batch *fbatch, int index, int nr_folios,
+ bool uptodate);
bool f2fs_sanity_check_cluster(struct dnode_of_data *dn);
void f2fs_compress_ctx_add_page(struct compress_ctx *cc, struct page *page);
int f2fs_write_multi_pages(struct compress_ctx *cc,
--
2.36.1

2022-09-01 22:06:22

by Vishal Moola

Subject: [PATCH 13/23] f2fs: Convert f2fs_sync_node_pages() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal
of find_get_pages_range_tag(). Does NOT support large folios.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/node.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index c2b54c58392a..cf8665f04c0d 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1933,23 +1933,24 @@ int f2fs_sync_node_pages(struct f2fs_sb_info *sbi,
bool do_balance, enum iostat_type io_type)
{
pgoff_t index;
- struct pagevec pvec;
+ struct folio_batch fbatch;
int step = 0;
int nwritten = 0;
int ret = 0;
- int nr_pages, done = 0;
+ int nr_folios, done = 0;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

next_step:
index = 0;

- while (!done && (nr_pages = pagevec_lookup_tag(&pvec,
- NODE_MAPPING(sbi), &index, PAGECACHE_TAG_DIRTY))) {
+ while (!done && (nr_folios = filemap_get_folios_tag(NODE_MAPPING(sbi),
+ &index, (pgoff_t)-1, PAGECACHE_TAG_DIRTY,
+ &fbatch))) {
int i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct page *page = &fbatch.folios[i]->page;
bool submitted = false;

/* give a priority to WB_SYNC threads */
@@ -2024,7 +2025,7 @@ int f2fs_sync_node_pages(struct f2fs_sb_info *sbi,
if (--wbc->nr_to_write == 0)
break;
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();

if (wbc->nr_to_write == 0) {
--
2.36.1

2022-09-01 22:06:31

by Vishal Moola

Subject: [PATCH 20/23] nilfs2: Convert nilfs_btree_lookup_dirty_buffers() to use filemap_get_folios_tag()

Convert the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/nilfs2/btree.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c
index 9f4d9432d38a..1e26f32a4e36 100644
--- a/fs/nilfs2/btree.c
+++ b/fs/nilfs2/btree.c
@@ -2143,7 +2143,7 @@ static void nilfs_btree_lookup_dirty_buffers(struct nilfs_bmap *btree,
struct inode *btnc_inode = NILFS_BMAP_I(btree)->i_assoc_inode;
struct address_space *btcache = btnc_inode->i_mapping;
struct list_head lists[NILFS_BTREE_LEVEL_MAX];
- struct pagevec pvec;
+ struct folio_batch fbatch;
struct buffer_head *bh, *head;
pgoff_t index = 0;
int level, i;
@@ -2153,19 +2153,19 @@ static void nilfs_btree_lookup_dirty_buffers(struct nilfs_bmap *btree,
level++)
INIT_LIST_HEAD(&lists[level]);

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

- while (pagevec_lookup_tag(&pvec, btcache, &index,
- PAGECACHE_TAG_DIRTY)) {
- for (i = 0; i < pagevec_count(&pvec); i++) {
- bh = head = page_buffers(pvec.pages[i]);
+ while (filemap_get_folios_tag(btcache, &index, (pgoff_t)-1,
+ PAGECACHE_TAG_DIRTY, &fbatch)) {
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
+ bh = head = folio_buffers(fbatch.folios[i]);
do {
if (buffer_dirty(bh))
nilfs_btree_add_dirty_buffer(btree,
lists, bh);
} while ((bh = bh->b_this_page) != head);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}

--
2.36.1

2022-09-01 22:07:02

by Vishal Moola

Subject: [PATCH 15/23] f2fs: Convert last_fsync_dnode() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal of
find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/node.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index cf8665f04c0d..b993be76013e 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1513,23 +1513,24 @@ static void flush_inline_data(struct f2fs_sb_info *sbi, nid_t ino)
static struct page *last_fsync_dnode(struct f2fs_sb_info *sbi, nid_t ino)
{
pgoff_t index;
- struct pagevec pvec;
+ struct folio_batch fbatch;
struct page *last_page = NULL;
- int nr_pages;
+ int nr_folios;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
index = 0;

- while ((nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index,
- PAGECACHE_TAG_DIRTY))) {
+ while ((nr_folios = filemap_get_folios_tag(NODE_MAPPING(sbi), &index,
+ (pgoff_t)-1, PAGECACHE_TAG_DIRTY,
+ &fbatch))) {
int i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct page *page = &fbatch.folios[i]->page;

if (unlikely(f2fs_cp_error(sbi))) {
f2fs_put_page(last_page, 0);
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
return ERR_PTR(-EIO);
}

@@ -1560,7 +1561,7 @@ static struct page *last_fsync_dnode(struct f2fs_sb_info *sbi, nid_t ino)
last_page = page;
unlock_page(page);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
return last_page;
--
2.36.1

2022-09-01 22:07:03

by Vishal Moola

Subject: [PATCH 23/23] filemap: Remove find_get_pages_range_tag()

All callers of find_get_pages_range_tag(), find_get_pages_tag(),
pagevec_lookup_range_tag(), and pagevec_lookup_tag() have been removed,
so the functions themselves can now be deleted.
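
For reference, the replacement introduced in patch 02 (not shown in this
section) follows the same structure as the function removed below, but
fills a folio_batch instead of a page array; a sketch, assuming it reuses
find_get_entry() the same way:

        unsigned filemap_get_folios_tag(struct address_space *mapping,
                        pgoff_t *start, pgoff_t end, xa_mark_t tag,
                        struct folio_batch *fbatch)
        {
                XA_STATE(xas, &mapping->i_pages, *start);
                struct folio *folio;

                rcu_read_lock();
                while ((folio = find_get_entry(&xas, end, tag)) != NULL) {
                        /* shadow entries are never tagged; skip any we race with */
                        if (xa_is_value(folio))
                                continue;
                        if (!folio_batch_add(fbatch, folio)) {
                                /* batch full: resume after this folio next time */
                                *start = folio->index + folio_nr_pages(folio);
                                goto out;
                        }
                }
                /* reached @end without filling the batch */
                if (end == (pgoff_t)-1)
                        *start = (pgoff_t)-1;
                else
                        *start = end + 1;
        out:
                rcu_read_unlock();
                return folio_batch_count(fbatch);
        }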

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
include/linux/pagemap.h | 10 -------
include/linux/pagevec.h | 8 ------
mm/filemap.c | 60 -----------------------------------------
mm/swap.c | 10 -------
4 files changed, 88 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 85cc96c82c2c..b8ea33751a66 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -742,16 +742,6 @@ unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t start,
unsigned int nr_pages, struct page **pages);
unsigned filemap_get_folios_tag(struct address_space *mapping, pgoff_t *start,
pgoff_t end, xa_mark_t tag, struct folio_batch *fbatch);
-unsigned find_get_pages_range_tag(struct address_space *mapping, pgoff_t *index,
- pgoff_t end, xa_mark_t tag, unsigned int nr_pages,
- struct page **pages);
-static inline unsigned find_get_pages_tag(struct address_space *mapping,
- pgoff_t *index, xa_mark_t tag, unsigned int nr_pages,
- struct page **pages)
-{
- return find_get_pages_range_tag(mapping, index, (pgoff_t)-1, tag,
- nr_pages, pages);
-}

struct page *grab_cache_page_write_begin(struct address_space *mapping,
pgoff_t index);
diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index 215eb6c3bdc9..a520632297ac 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -26,14 +26,6 @@ struct pagevec {
};

void __pagevec_release(struct pagevec *pvec);
-unsigned pagevec_lookup_range_tag(struct pagevec *pvec,
- struct address_space *mapping, pgoff_t *index, pgoff_t end,
- xa_mark_t tag);
-static inline unsigned pagevec_lookup_tag(struct pagevec *pvec,
- struct address_space *mapping, pgoff_t *index, xa_mark_t tag)
-{
- return pagevec_lookup_range_tag(pvec, mapping, index, (pgoff_t)-1, tag);
-}

static inline void pagevec_init(struct pagevec *pvec)
{
diff --git a/mm/filemap.c b/mm/filemap.c
index 435fc53b3f2f..b986f246a6ae 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2309,66 +2309,6 @@ unsigned filemap_get_folios_tag(struct address_space *mapping, pgoff_t *start,
}
EXPORT_SYMBOL(filemap_get_folios_tag);

-/**
- * find_get_pages_range_tag - Find and return head pages matching @tag.
- * @mapping: the address_space to search
- * @index: the starting page index
- * @end: The final page index (inclusive)
- * @tag: the tag index
- * @nr_pages: the maximum number of pages
- * @pages: where the resulting pages are placed
- *
- * Like find_get_pages_range(), except we only return head pages which are
- * tagged with @tag. @index is updated to the index immediately after the
- * last page we return, ready for the next iteration.
- *
- * Return: the number of pages which were found.
- */
-unsigned find_get_pages_range_tag(struct address_space *mapping, pgoff_t *index,
- pgoff_t end, xa_mark_t tag, unsigned int nr_pages,
- struct page **pages)
-{
- XA_STATE(xas, &mapping->i_pages, *index);
- struct folio *folio;
- unsigned ret = 0;
-
- if (unlikely(!nr_pages))
- return 0;
-
- rcu_read_lock();
- while ((folio = find_get_entry(&xas, end, tag))) {
- /*
- * Shadow entries should never be tagged, but this iteration
- * is lockless so there is a window for page reclaim to evict
- * a page we saw tagged. Skip over it.
- */
- if (xa_is_value(folio))
- continue;
-
- pages[ret] = &folio->page;
- if (++ret == nr_pages) {
- *index = folio->index + folio_nr_pages(folio);
- goto out;
- }
- }
-
- /*
- * We come here when we got to @end. We take care to not overflow the
- * index @index as it confuses some of the callers. This breaks the
- * iteration when there is a page at index -1 but that is already
- * broken anyway.
- */
- if (end == (pgoff_t)-1)
- *index = (pgoff_t)-1;
- else
- *index = end + 1;
-out:
- rcu_read_unlock();
-
- return ret;
-}
-EXPORT_SYMBOL(find_get_pages_range_tag);
-
/*
* CD/DVDs are error prone. When a medium error occurs, the driver may fail
* a _large_ part of the i/o request. Imagine the worst scenario:
diff --git a/mm/swap.c b/mm/swap.c
index 9cee7f6a3809..7b8c1c8024a1 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -1055,16 +1055,6 @@ void folio_batch_remove_exceptionals(struct folio_batch *fbatch)
fbatch->nr = j;
}

-unsigned pagevec_lookup_range_tag(struct pagevec *pvec,
- struct address_space *mapping, pgoff_t *index, pgoff_t end,
- xa_mark_t tag)
-{
- pvec->nr = find_get_pages_range_tag(mapping, index, end, tag,
- PAGEVEC_SIZE, pvec->pages);
- return pagevec_count(pvec);
-}
-EXPORT_SYMBOL(pagevec_lookup_range_tag);
-
/*
* Perform any setup for the swap system
*/
--
2.36.1

2022-09-01 22:07:10

by Vishal Moola

Subject: [PATCH 16/23] f2fs: Convert f2fs_sync_meta_pages() to use filemap_get_folios_tag()

Convert the function to use folios. This is in preparation for the removal
of find_get_pages_range_tag().

Initially the function checked whether the previous page index was truly the
previous page, i.e. exactly 1 index behind the current page. To convert to
folios and maintain this check, the check must become
folio->index != prev + folio_nr_pages(previous folio), since we don't know
how many pages are in a folio.

At index i == 0 the check is guaranteed to succeed, so to work around the
indexing bounds we can simply skip the check for that specific index. This
makes the initial assignment of prev trivial, so I removed that as well.
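
For example (hypothetical indices): if the previous entry in the batch is a
4-page folio at index 8, then prev == 8 and the current folio is contiguous
only if folio->index == 8 + 4 == 12; with single-page folios this reduces to
the old prev + 1 check. In the diff below this reads:

        if (nr_to_write != LONG_MAX && i != 0 &&
                        folio->index != prev +
                        folio_nr_pages(fbatch.folios[i - 1]))
                /* not contiguous: stop writing out meta pages */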

Also modified a comment in commit_checkpoint for consistency.

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/f2fs/checkpoint.c | 49 +++++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 8259e0fa97e1..9f6694f7d723 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -377,59 +377,62 @@ long f2fs_sync_meta_pages(struct f2fs_sb_info *sbi, enum page_type type,
{
struct address_space *mapping = META_MAPPING(sbi);
pgoff_t index = 0, prev = ULONG_MAX;
- struct pagevec pvec;
+ struct folio_batch fbatch;
long nwritten = 0;
- int nr_pages;
+ int nr_folios;
struct writeback_control wbc = {
.for_reclaim = 0,
};
struct blk_plug plug;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

blk_start_plug(&plug);

- while ((nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
- PAGECACHE_TAG_DIRTY))) {
+ while ((nr_folios = filemap_get_folios_tag(mapping, &index,
+ (pgoff_t)-1,
+ PAGECACHE_TAG_DIRTY, &fbatch))) {
int i;

- for (i = 0; i < nr_pages; i++) {
- struct page *page = pvec.pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch.folios[i];

- if (prev == ULONG_MAX)
- prev = page->index - 1;
- if (nr_to_write != LONG_MAX && page->index != prev + 1) {
- pagevec_release(&pvec);
+ if (nr_to_write != LONG_MAX && i != 0 &&
+ folio->index != prev +
+ folio_nr_pages(fbatch.folios[i-1])) {
+ folio_batch_release(&fbatch);
goto stop;
}

- lock_page(page);
+ folio_lock(folio);

- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(folio->mapping != mapping)) {
continue_unlock:
- unlock_page(page);
+ folio_unlock(folio);
continue;
}
- if (!PageDirty(page)) {
+ if (!folio_test_dirty(folio)) {
/* someone wrote it for us */
goto continue_unlock;
}

- f2fs_wait_on_page_writeback(page, META, true, true);
+ f2fs_wait_on_page_writeback(&folio->page, META,
+ true, true);

- if (!clear_page_dirty_for_io(page))
+ if (!folio_clear_dirty_for_io(folio))
goto continue_unlock;

- if (__f2fs_write_meta_page(page, &wbc, io_type)) {
- unlock_page(page);
+ if (__f2fs_write_meta_page(&folio->page, &wbc,
+ io_type)) {
+ folio_unlock(folio);
break;
}
- nwritten++;
- prev = page->index;
+ nwritten += folio_nr_pages(folio);
+ prev = folio->index;
if (unlikely(nwritten >= nr_to_write))
break;
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
stop:
@@ -1381,7 +1384,7 @@ static void commit_checkpoint(struct f2fs_sb_info *sbi,
};

/*
- * pagevec_lookup_tag and lock_page again will take
+ * filemap_get_folios_tag and lock_page again will take
* some extra time. Therefore, f2fs_update_meta_pages and
* f2fs_sync_meta_pages are combined in this function.
*/
--
2.36.1

2022-09-01 22:07:15

by Vishal Moola

Subject: [PATCH 18/23] nilfs2: Convert nilfs_lookup_dirty_data_buffers() to use filemap_get_folios_tag()

Convert the function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/nilfs2/segment.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 0afe0832c754..e95c667bdc8f 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -680,7 +680,7 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
loff_t start, loff_t end)
{
struct address_space *mapping = inode->i_mapping;
- struct pagevec pvec;
+ struct folio_batch fbatch;
pgoff_t index = 0, last = ULONG_MAX;
size_t ndirties = 0;
int i;
@@ -694,23 +694,26 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
index = start >> PAGE_SHIFT;
last = end >> PAGE_SHIFT;
}
- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
repeat:
if (unlikely(index > last) ||
- !pagevec_lookup_range_tag(&pvec, mapping, &index, last,
- PAGECACHE_TAG_DIRTY))
+ !filemap_get_folios_tag(mapping, &index, last,
+ PAGECACHE_TAG_DIRTY, &fbatch))
return ndirties;

- for (i = 0; i < pagevec_count(&pvec); i++) {
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
struct buffer_head *bh, *head;
- struct page *page = pvec.pages[i];
+ struct folio *folio = fbatch.folios[i];

- lock_page(page);
- if (!page_has_buffers(page))
- create_empty_buffers(page, i_blocksize(inode), 0);
- unlock_page(page);
+ head = folio_buffers(folio);
+ folio_lock(folio);
+ if (!head) {
+ create_empty_buffers(&folio->page, i_blocksize(inode), 0);
+ head = folio_buffers(folio);
+ }
+ folio_unlock(folio);

- bh = head = page_buffers(page);
+ bh = head;
do {
if (!buffer_dirty(bh) || buffer_async_write(bh))
continue;
@@ -718,13 +721,13 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
list_add_tail(&bh->b_assoc_buffers, listp);
ndirties++;
if (unlikely(ndirties >= nlimit)) {
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
return ndirties;
}
} while (bh = bh->b_this_page, bh != head);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
goto repeat;
}
--
2.36.1

2022-09-01 22:07:28

by Vishal Moola

[permalink] [raw]
Subject: [PATCH 17/23] gfs2: Convert gfs2_write_cache_jdata() to use filemap_get_folios_tag()

Converted function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Also modified and renamed gfs2_write_jdata_pagevec() to
gfs2_write_jdata_batch(); it now takes and uses a folio_batch rather
than a pagevec, operates on folios rather than pages, and supports
large folios.
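
The key change is the transaction size calculation: a batch entry can
now be a large folio, so the page count must be summed per folio
instead of assuming one page per entry. Simplified from the hunk below:

    int nr_pages = 0;
    int nr_folios = folio_batch_count(fbatch);

    for (i = 0; i < nr_folios; i++)
        nr_pages += folio_nr_pages(fbatch->folios[i]);
    nrblocks = nr_pages * (PAGE_SIZE >> inode->i_blkbits);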

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/gfs2/aops.c | 64 +++++++++++++++++++++++++++-----------------------
1 file changed, 35 insertions(+), 29 deletions(-)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 05bee80ac7de..8f87c2551a3d 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -195,67 +195,71 @@ static int gfs2_writepages(struct address_space *mapping,
}

/**
- * gfs2_write_jdata_pagevec - Write back a pagevec's worth of pages
+ * gfs2_write_jdata_batch - Write back a folio batch's worth of folios
* @mapping: The mapping
* @wbc: The writeback control
- * @pvec: The vector of pages
- * @nr_pages: The number of pages to write
+ * @fbatch: The batch of folios
* @done_index: Page index
*
* Returns: non-zero if loop should terminate, zero otherwise
*/

-static int gfs2_write_jdata_pagevec(struct address_space *mapping,
+static int gfs2_write_jdata_batch(struct address_space *mapping,
struct writeback_control *wbc,
- struct pagevec *pvec,
- int nr_pages,
+ struct folio_batch *fbatch,
pgoff_t *done_index)
{
struct inode *inode = mapping->host;
struct gfs2_sbd *sdp = GFS2_SB(inode);
- unsigned nrblocks = nr_pages * (PAGE_SIZE >> inode->i_blkbits);
+ unsigned nrblocks;
int i;
int ret;
+ int nr_pages = 0;
+ int nr_folios = folio_batch_count(fbatch);
+
+ for (i = 0; i < nr_folios; i++)
+ nr_pages += folio_nr_pages(fbatch->folios[i]);
+ nrblocks = nr_pages * (PAGE_SIZE >> inode->i_blkbits);

ret = gfs2_trans_begin(sdp, nrblocks, nrblocks);
if (ret < 0)
return ret;

- for(i = 0; i < nr_pages; i++) {
- struct page *page = pvec->pages[i];
+ for (i = 0; i < nr_folios; i++) {
+ struct folio *folio = fbatch->folios[i];

- *done_index = page->index;
+ *done_index = folio->index;

- lock_page(page);
+ folio_lock(folio);

- if (unlikely(page->mapping != mapping)) {
+ if (unlikely(folio->mapping != mapping)) {
continue_unlock:
- unlock_page(page);
+ folio_unlock(folio);
continue;
}

- if (!PageDirty(page)) {
+ if (!folio_test_dirty(folio)) {
/* someone wrote it for us */
goto continue_unlock;
}

- if (PageWriteback(page)) {
+ if (folio_test_writeback(folio)) {
if (wbc->sync_mode != WB_SYNC_NONE)
- wait_on_page_writeback(page);
+ folio_wait_writeback(folio);
else
goto continue_unlock;
}

- BUG_ON(PageWriteback(page));
- if (!clear_page_dirty_for_io(page))
+ BUG_ON(folio_test_writeback(folio));
+ if (!folio_clear_dirty_for_io(folio))
goto continue_unlock;

trace_wbc_writepage(wbc, inode_to_bdi(inode));

- ret = __gfs2_jdata_writepage(page, wbc);
+ ret = __gfs2_jdata_writepage(&folio->page, wbc);
if (unlikely(ret)) {
if (ret == AOP_WRITEPAGE_ACTIVATE) {
- unlock_page(page);
+ folio_unlock(folio);
ret = 0;
} else {

@@ -268,7 +272,8 @@ static int gfs2_write_jdata_pagevec(struct address_space *mapping,
* not be suitable for data integrity
* writeout).
*/
- *done_index = page->index + 1;
+ *done_index = folio->index +
+ folio_nr_pages(folio);
ret = 1;
break;
}
@@ -305,8 +310,8 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
{
int ret = 0;
int done = 0;
- struct pagevec pvec;
- int nr_pages;
+ struct folio_batch fbatch;
+ int nr_folios;
pgoff_t writeback_index;
pgoff_t index;
pgoff_t end;
@@ -315,7 +320,7 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
int range_whole = 0;
xa_mark_t tag;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
if (wbc->range_cyclic) {
writeback_index = mapping->writeback_index; /* prev offset */
index = writeback_index;
@@ -341,17 +346,18 @@ static int gfs2_write_cache_jdata(struct address_space *mapping,
tag_pages_for_writeback(mapping, index, end);
done_index = index;
while (!done && (index <= end)) {
- nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
- tag);
- if (nr_pages == 0)
+ nr_folios = filemap_get_folios_tag(mapping, &index, end,
+ tag, &fbatch);
+ if (nr_folios == 0)
break;

- ret = gfs2_write_jdata_pagevec(mapping, wbc, &pvec, nr_pages, &done_index);
+ ret = gfs2_write_jdata_batch(mapping, wbc, &fbatch,
+ &done_index);
if (ret)
done = 1;
if (ret > 0)
ret = 0;
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}

--
2.36.1

2022-09-01 22:07:48

by Vishal Moola

[permalink] [raw]
Subject: [PATCH 19/23] nilfs2: Convert nilfs_lookup_dirty_node_buffers() to use filemap_get_folios_tag()

Convert function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/nilfs2/segment.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index e95c667bdc8f..d386d913e349 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -737,20 +737,19 @@ static void nilfs_lookup_dirty_node_buffers(struct inode *inode,
{
struct nilfs_inode_info *ii = NILFS_I(inode);
struct inode *btnc_inode = ii->i_assoc_inode;
- struct pagevec pvec;
+ struct folio_batch fbatch;
struct buffer_head *bh, *head;
unsigned int i;
pgoff_t index = 0;

if (!btnc_inode)
return;
+ folio_batch_init(&fbatch);

- pagevec_init(&pvec);
-
- while (pagevec_lookup_tag(&pvec, btnc_inode->i_mapping, &index,
- PAGECACHE_TAG_DIRTY)) {
- for (i = 0; i < pagevec_count(&pvec); i++) {
- bh = head = page_buffers(pvec.pages[i]);
+ while (filemap_get_folios_tag(btnc_inode->i_mapping, &index,
+ (pgoff_t)-1, PAGECACHE_TAG_DIRTY, &fbatch)) {
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
+ bh = head = folio_buffers(fbatch.folios[i]);
do {
if (buffer_dirty(bh) &&
!buffer_async_write(bh)) {
@@ -761,7 +760,7 @@ static void nilfs_lookup_dirty_node_buffers(struct inode *inode,
bh = bh->b_this_page;
} while (bh != head);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
}
--
2.36.1

2022-09-01 22:07:57

by Vishal Moola

[permalink] [raw]
Subject: [PATCH 21/23] nilfs2: Convert nilfs_copy_dirty_pages() to use filemap_get_folios_tag()

Convert function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/nilfs2/page.c | 39 ++++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c
index 3267e96c256c..5c96084e829f 100644
--- a/fs/nilfs2/page.c
+++ b/fs/nilfs2/page.c
@@ -240,42 +240,43 @@ static void nilfs_copy_page(struct page *dst, struct page *src, int copy_dirty)
int nilfs_copy_dirty_pages(struct address_space *dmap,
struct address_space *smap)
{
- struct pagevec pvec;
+ struct folio_batch fbatch;
unsigned int i;
pgoff_t index = 0;
int err = 0;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);
repeat:
- if (!pagevec_lookup_tag(&pvec, smap, &index, PAGECACHE_TAG_DIRTY))
+ if (!filemap_get_folios_tag(smap, &index, (pgoff_t)-1,
+ PAGECACHE_TAG_DIRTY, &fbatch))
return 0;

- for (i = 0; i < pagevec_count(&pvec); i++) {
- struct page *page = pvec.pages[i], *dpage;
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
+ struct folio *folio = fbatch.folios[i], *dfolio;

- lock_page(page);
- if (unlikely(!PageDirty(page)))
- NILFS_PAGE_BUG(page, "inconsistent dirty state");
+ folio_lock(folio);
+ if (unlikely(!folio_test_dirty(folio)))
+ NILFS_PAGE_BUG(&folio->page, "inconsistent dirty state");

- dpage = grab_cache_page(dmap, page->index);
- if (unlikely(!dpage)) {
+ dfolio = filemap_grab_folio(dmap, folio->index);
+ if (unlikely(!dfolio)) {
/* No empty page is added to the page cache */
err = -ENOMEM;
- unlock_page(page);
+ folio_unlock(folio);
break;
}
- if (unlikely(!page_has_buffers(page)))
- NILFS_PAGE_BUG(page,
+ if (unlikely(!folio_buffers(folio)))
+ NILFS_PAGE_BUG(&folio->page,
"found empty page in dat page cache");

- nilfs_copy_page(dpage, page, 1);
- __set_page_dirty_nobuffers(dpage);
+ nilfs_copy_page(&dfolio->page, &folio->page, 1);
+ filemap_dirty_folio(folio_mapping(dfolio), dfolio);

- unlock_page(dpage);
- put_page(dpage);
- unlock_page(page);
+ folio_unlock(dfolio);
+ folio_put(dfolio);
+ folio_unlock(folio);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();

if (likely(!err))
--
2.36.1
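
As an aside, filemap_grab_folio() - added earlier in this series - is
the folio analogue of grab_cache_page(): it looks up the folio at the
given index, creating and locking it if absent. A minimal usage sketch,
assuming the NULL-on-failure semantics the hunk above relies on:

    struct folio *folio;

    folio = filemap_grab_folio(mapping, index);
    if (!folio)
        return -ENOMEM; /* nothing was added to the page cache */

    /* folio is returned locked with an elevated refcount */
    folio_mark_dirty(folio);
    folio_unlock(folio);
    folio_put(folio);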

2022-09-01 22:20:11

by Vishal Moola

[permalink] [raw]
Subject: [PATCH 22/23] nilfs2: Convert nilfs_clear_dirty_pages() to use filemap_get_folios_tag()

Convert function to use folios throughout. This is in preparation for
the removal of find_get_pages_range_tag().

Signed-off-by: Vishal Moola (Oracle) <[email protected]>
---
fs/nilfs2/page.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c
index 5c96084e829f..b66f4e988016 100644
--- a/fs/nilfs2/page.c
+++ b/fs/nilfs2/page.c
@@ -358,22 +358,22 @@ void nilfs_copy_back_pages(struct address_space *dmap,
*/
void nilfs_clear_dirty_pages(struct address_space *mapping, bool silent)
{
- struct pagevec pvec;
+ struct folio_batch fbatch;
unsigned int i;
pgoff_t index = 0;

- pagevec_init(&pvec);
+ folio_batch_init(&fbatch);

- while (pagevec_lookup_tag(&pvec, mapping, &index,
- PAGECACHE_TAG_DIRTY)) {
- for (i = 0; i < pagevec_count(&pvec); i++) {
- struct page *page = pvec.pages[i];
+ while (filemap_get_folios_tag(mapping, &index, (pgoff_t)-1,
+ PAGECACHE_TAG_DIRTY, &fbatch)) {
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
+ struct folio *folio = fbatch.folios[i];

- lock_page(page);
- nilfs_clear_dirty_page(page, silent);
- unlock_page(page);
+ folio_lock(folio);
+ nilfs_clear_dirty_page(&folio->page, silent);
+ folio_unlock(folio);
}
- pagevec_release(&pvec);
+ folio_batch_release(&fbatch);
cond_resched();
}
}
--
2.36.1

2022-09-02 13:58:37

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH 07/23] btrfs: Convert extent_write_cache_pages() to use filemap_get_folios_tag()

On Thu, Sep 01, 2022 at 03:01:22PM -0700, Vishal Moola (Oracle) wrote:
> Converted function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag(). Now also supports large
> folios.
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>

Acked-by: David Sterba <[email protected]>

2022-09-02 14:00:00

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH 06/23] btrfs: Convert btree_write_cache_pages() to use filemap_get_folio_tag()

On Thu, Sep 01, 2022 at 03:01:21PM -0700, Vishal Moola (Oracle) wrote:
> Converted function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>

Acked-by: David Sterba <[email protected]>

2022-09-02 20:17:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 14/23] f2fs: Convert f2fs_write_cache_pages() to use filemap_get_folios_tag()

Hi Vishal,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on jaegeuk-f2fs/dev-test]
[also build test ERROR on kdave/for-next linus/master v6.0-rc3]
[cannot apply to ceph-client/for-linus next-20220901]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Vishal-Moola-Oracle/Convert-to-filemap_get_folios_tag/20220902-060430
base: https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev-test
config: hexagon-randconfig-r045-20220901 (https://download.01.org/0day-ci/archive/20220903/[email protected]/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project c55b41d5199d2394dd6cdb8f52180d8b81d809d4)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/6c74320953cd3749db95f9f09c1fc7d044933635
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Vishal-Moola-Oracle/Convert-to-filemap_get_folios_tag/20220902-060430
git checkout 6c74320953cd3749db95f9f09c1fc7d044933635
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash fs/f2fs/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> fs/f2fs/data.c:3016:18: error: use of undeclared identifier 'nr_pages'; did you mean 'dir_pages'?
&fbatch, i, nr_pages, true))
^~~~~~~~
dir_pages
include/linux/pagemap.h:1404:29: note: 'dir_pages' declared here
static inline unsigned long dir_pages(struct inode *inode)
^
>> fs/f2fs/data.c:3017:11: error: use of undeclared label 'lock_page'
goto lock_page;
^
2 errors generated.


vim +3016 fs/f2fs/data.c

2908
2909 /*
2910 * This function was copied from write_cche_pages from mm/page-writeback.c.
2911 * The major change is making write step of cold data page separately from
2912 * warm/hot data page.
2913 */
2914 static int f2fs_write_cache_pages(struct address_space *mapping,
2915 struct writeback_control *wbc,
2916 enum iostat_type io_type)
2917 {
2918 int ret = 0;
2919 int done = 0, retry = 0;
2920 struct folio_batch fbatch;
2921 struct f2fs_sb_info *sbi = F2FS_M_SB(mapping);
2922 struct bio *bio = NULL;
2923 sector_t last_block;
2924 #ifdef CONFIG_F2FS_FS_COMPRESSION
2925 struct inode *inode = mapping->host;
2926 struct compress_ctx cc = {
2927 .inode = inode,
2928 .log_cluster_size = F2FS_I(inode)->i_log_cluster_size,
2929 .cluster_size = F2FS_I(inode)->i_cluster_size,
2930 .cluster_idx = NULL_CLUSTER,
2931 .rpages = NULL,
2932 .nr_rpages = 0,
2933 .cpages = NULL,
2934 .valid_nr_cpages = 0,
2935 .rbuf = NULL,
2936 .cbuf = NULL,
2937 .rlen = PAGE_SIZE * F2FS_I(inode)->i_cluster_size,
2938 .private = NULL,
2939 };
2940 #endif
2941 int nr_folios;
2942 pgoff_t index;
2943 pgoff_t end; /* Inclusive */
2944 pgoff_t done_index;
2945 int range_whole = 0;
2946 xa_mark_t tag;
2947 int nwritten = 0;
2948 int submitted = 0;
2949 int i;
2950
2951 folio_batch_init(&fbatch);
2952
2953 if (get_dirty_pages(mapping->host) <=
2954 SM_I(F2FS_M_SB(mapping))->min_hot_blocks)
2955 set_inode_flag(mapping->host, FI_HOT_DATA);
2956 else
2957 clear_inode_flag(mapping->host, FI_HOT_DATA);
2958
2959 if (wbc->range_cyclic) {
2960 index = mapping->writeback_index; /* prev offset */
2961 end = -1;
2962 } else {
2963 index = wbc->range_start >> PAGE_SHIFT;
2964 end = wbc->range_end >> PAGE_SHIFT;
2965 if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
2966 range_whole = 1;
2967 }
2968 if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
2969 tag = PAGECACHE_TAG_TOWRITE;
2970 else
2971 tag = PAGECACHE_TAG_DIRTY;
2972 retry:
2973 retry = 0;
2974 if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
2975 tag_pages_for_writeback(mapping, index, end);
2976 done_index = index;
2977 while (!done && !retry && (index <= end)) {
2978 nr_folios = filemap_get_folios_tag(mapping, &index, end,
2979 tag, &fbatch);
2980 if (nr_folios == 0)
2981 break;
2982
2983 for (i = 0; i < nr_folios; i++) {
2984 struct folio *folio = fbatch.folios[i];
2985 bool need_readd;
2986 readd:
2987 need_readd = false;
2988 #ifdef CONFIG_F2FS_FS_COMPRESSION
2989 if (f2fs_compressed_file(inode)) {
2990 void *fsdata = NULL;
2991 struct page *pagep;
2992 int ret2;
2993
2994 ret = f2fs_init_compress_ctx(&cc);
2995 if (ret) {
2996 done = 1;
2997 break;
2998 }
2999
3000 if (!f2fs_cluster_can_merge_page(&cc,
3001 folio->index)) {
3002 ret = f2fs_write_multi_pages(&cc,
3003 &submitted, wbc, io_type);
3004 if (!ret)
3005 need_readd = true;
3006 goto result;
3007 }
3008
3009 if (unlikely(f2fs_cp_error(sbi)))
3010 goto lock_folio;
3011
3012 if (!f2fs_cluster_is_empty(&cc))
3013 goto lock_folio;
3014
3015 if (f2fs_all_cluster_page_ready(&cc,
> 3016 &fbatch, i, nr_pages, true))
> 3017 goto lock_page;
3018
3019 ret2 = f2fs_prepare_compress_overwrite(
3020 inode, &pagep,
3021 folio->index, &fsdata);
3022 if (ret2 < 0) {
3023 ret = ret2;
3024 done = 1;
3025 break;
3026 } else if (ret2 &&
3027 (!f2fs_compress_write_end(inode,
3028 fsdata, folio->index, 1) ||
3029 !f2fs_all_cluster_page_ready(&cc,
3030 &fbatch, i, nr_folios,
3031 false))) {
3032 retry = 1;
3033 break;
3034 }
3035 }
3036 #endif
3037 /* give a priority to WB_SYNC threads */
3038 if (atomic_read(&sbi->wb_sync_req[DATA]) &&
3039 wbc->sync_mode == WB_SYNC_NONE) {
3040 done = 1;
3041 break;
3042 }
3043 #ifdef CONFIG_F2FS_FS_COMPRESSION
3044 lock_folio:
3045 #endif
3046 done_index = folio->index;
3047 retry_write:
3048 folio_lock(folio);
3049
3050 if (unlikely(folio->mapping != mapping)) {
3051 continue_unlock:
3052 folio_unlock(folio);
3053 continue;
3054 }
3055
3056 if (!folio_test_dirty(folio)) {
3057 /* someone wrote it for us */
3058 goto continue_unlock;
3059 }
3060
3061 if (folio_test_writeback(folio)) {
3062 if (wbc->sync_mode != WB_SYNC_NONE)
3063 f2fs_wait_on_page_writeback(
3064 &folio->page,
3065 DATA, true, true);
3066 else
3067 goto continue_unlock;
3068 }
3069
3070 if (!folio_clear_dirty_for_io(folio))
3071 goto continue_unlock;
3072

--
0-DAY CI Kernel Test Service
https://01.org/lkp
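
The two errors above point at stale identifiers left over from the
pre-conversion code: the second call to f2fs_all_cluster_page_ready()
in the same function already passes nr_folios, and the lock_folio:
label already exists. The likely fix (an assumption based on that
context; no such fixup is posted in this thread) would be:

    if (f2fs_all_cluster_page_ready(&cc,
                &fbatch, i, nr_folios, true))  /* was: nr_pages */
        goto lock_folio;                       /* was: goto lock_page */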

2022-09-02 21:46:12

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 14/23] f2fs: Convert f2fs_write_cache_pages() to use filemap_get_folios_tag()

Hi Vishal,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on jaegeuk-f2fs/dev-test]
[also build test ERROR on kdave/for-next linus/master v6.0-rc3]
[cannot apply to ceph-client/for-linus next-20220901]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Vishal-Moola-Oracle/Convert-to-filemap_get_folios_tag/20220902-060430
base: https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git dev-test
config: arc-randconfig-r043-20220901 (https://download.01.org/0day-ci/archive/20220903/[email protected]/config)
compiler: arc-elf-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/6c74320953cd3749db95f9f09c1fc7d044933635
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Vishal-Moola-Oracle/Convert-to-filemap_get_folios_tag/20220902-060430
git checkout 6c74320953cd3749db95f9f09c1fc7d044933635
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arc SHELL=/bin/bash fs/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

fs/f2fs/data.c: In function 'f2fs_write_cache_pages':
>> fs/f2fs/data.c:3016:53: error: 'nr_pages' undeclared (first use in this function); did you mean 'dir_pages'?
3016 | &fbatch, i, nr_pages, true))
| ^~~~~~~~
| dir_pages
fs/f2fs/data.c:3016:53: note: each undeclared identifier is reported only once for each function it appears in
>> fs/f2fs/data.c:3017:41: error: label 'lock_page' used but not defined
3017 | goto lock_page;
| ^~~~


vim +3016 fs/f2fs/data.c

2908
2909 /*
2910 * This function was copied from write_cche_pages from mm/page-writeback.c.
2911 * The major change is making write step of cold data page separately from
2912 * warm/hot data page.
2913 */
2914 static int f2fs_write_cache_pages(struct address_space *mapping,
2915 struct writeback_control *wbc,
2916 enum iostat_type io_type)
2917 {
2918 int ret = 0;
2919 int done = 0, retry = 0;
2920 struct folio_batch fbatch;
2921 struct f2fs_sb_info *sbi = F2FS_M_SB(mapping);
2922 struct bio *bio = NULL;
2923 sector_t last_block;
2924 #ifdef CONFIG_F2FS_FS_COMPRESSION
2925 struct inode *inode = mapping->host;
2926 struct compress_ctx cc = {
2927 .inode = inode,
2928 .log_cluster_size = F2FS_I(inode)->i_log_cluster_size,
2929 .cluster_size = F2FS_I(inode)->i_cluster_size,
2930 .cluster_idx = NULL_CLUSTER,
2931 .rpages = NULL,
2932 .nr_rpages = 0,
2933 .cpages = NULL,
2934 .valid_nr_cpages = 0,
2935 .rbuf = NULL,
2936 .cbuf = NULL,
2937 .rlen = PAGE_SIZE * F2FS_I(inode)->i_cluster_size,
2938 .private = NULL,
2939 };
2940 #endif
2941 int nr_folios;
2942 pgoff_t index;
2943 pgoff_t end; /* Inclusive */
2944 pgoff_t done_index;
2945 int range_whole = 0;
2946 xa_mark_t tag;
2947 int nwritten = 0;
2948 int submitted = 0;
2949 int i;
2950
2951 folio_batch_init(&fbatch);
2952
2953 if (get_dirty_pages(mapping->host) <=
2954 SM_I(F2FS_M_SB(mapping))->min_hot_blocks)
2955 set_inode_flag(mapping->host, FI_HOT_DATA);
2956 else
2957 clear_inode_flag(mapping->host, FI_HOT_DATA);
2958
2959 if (wbc->range_cyclic) {
2960 index = mapping->writeback_index; /* prev offset */
2961 end = -1;
2962 } else {
2963 index = wbc->range_start >> PAGE_SHIFT;
2964 end = wbc->range_end >> PAGE_SHIFT;
2965 if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
2966 range_whole = 1;
2967 }
2968 if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
2969 tag = PAGECACHE_TAG_TOWRITE;
2970 else
2971 tag = PAGECACHE_TAG_DIRTY;
2972 retry:
2973 retry = 0;
2974 if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
2975 tag_pages_for_writeback(mapping, index, end);
2976 done_index = index;
2977 while (!done && !retry && (index <= end)) {
2978 nr_folios = filemap_get_folios_tag(mapping, &index, end,
2979 tag, &fbatch);
2980 if (nr_folios == 0)
2981 break;
2982
2983 for (i = 0; i < nr_folios; i++) {
2984 struct folio *folio = fbatch.folios[i];
2985 bool need_readd;
2986 readd:
2987 need_readd = false;
2988 #ifdef CONFIG_F2FS_FS_COMPRESSION
2989 if (f2fs_compressed_file(inode)) {
2990 void *fsdata = NULL;
2991 struct page *pagep;
2992 int ret2;
2993
2994 ret = f2fs_init_compress_ctx(&cc);
2995 if (ret) {
2996 done = 1;
2997 break;
2998 }
2999
3000 if (!f2fs_cluster_can_merge_page(&cc,
3001 folio->index)) {
3002 ret = f2fs_write_multi_pages(&cc,
3003 &submitted, wbc, io_type);
3004 if (!ret)
3005 need_readd = true;
3006 goto result;
3007 }
3008
3009 if (unlikely(f2fs_cp_error(sbi)))
3010 goto lock_folio;
3011
3012 if (!f2fs_cluster_is_empty(&cc))
3013 goto lock_folio;
3014
3015 if (f2fs_all_cluster_page_ready(&cc,
> 3016 &fbatch, i, nr_pages, true))
> 3017 goto lock_page;
3018
3019 ret2 = f2fs_prepare_compress_overwrite(
3020 inode, &pagep,
3021 folio->index, &fsdata);
3022 if (ret2 < 0) {
3023 ret = ret2;
3024 done = 1;
3025 break;
3026 } else if (ret2 &&
3027 (!f2fs_compress_write_end(inode,
3028 fsdata, folio->index, 1) ||
3029 !f2fs_all_cluster_page_ready(&cc,
3030 &fbatch, i, nr_folios,
3031 false))) {
3032 retry = 1;
3033 break;
3034 }
3035 }
3036 #endif
3037 /* give a priority to WB_SYNC threads */
3038 if (atomic_read(&sbi->wb_sync_req[DATA]) &&
3039 wbc->sync_mode == WB_SYNC_NONE) {
3040 done = 1;
3041 break;
3042 }
3043 #ifdef CONFIG_F2FS_FS_COMPRESSION
3044 lock_folio:
3045 #endif
3046 done_index = folio->index;
3047 retry_write:
3048 folio_lock(folio);
3049
3050 if (unlikely(folio->mapping != mapping)) {
3051 continue_unlock:
3052 folio_unlock(folio);
3053 continue;
3054 }
3055
3056 if (!folio_test_dirty(folio)) {
3057 /* someone wrote it for us */
3058 goto continue_unlock;
3059 }
3060
3061 if (folio_test_writeback(folio)) {
3062 if (wbc->sync_mode != WB_SYNC_NONE)
3063 f2fs_wait_on_page_writeback(
3064 &folio->page,
3065 DATA, true, true);
3066 else
3067 goto continue_unlock;
3068 }
3069
3070 if (!folio_clear_dirty_for_io(folio))
3071 goto continue_unlock;
3072

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-09-03 17:39:31

by Ryusuke Konishi

[permalink] [raw]
Subject: Re: [PATCH 20/23] nilfs2: Convert nilfs_btree_lookup_dirty_buffers() to use filemap_get_folios_tag()

On Fri, Sep 2, 2022 at 7:06 AM Vishal Moola (Oracle) wrote:
>
> Convert function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> fs/nilfs2/btree.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)

Acked-by: Ryusuke Konishi <[email protected]>

Thanks,
Ryusuke Konishi

2022-09-03 17:39:37

by Ryusuke Konishi

[permalink] [raw]
Subject: Re: [PATCH 22/23] nilfs2: Convert nilfs_clear_dirty_pages() to use filemap_get_folios_tag()

On Fri, Sep 2, 2022 at 7:14 AM Vishal Moola (Oracle) wrote:
>
> Convert function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> fs/nilfs2/page.c | 20 ++++++++++----------
> 1 file changed, 10 insertions(+), 10 deletions(-)

Acked-by: Ryusuke Konishi <[email protected]>

Thanks,
Ryusuke Konishi

2022-09-03 17:39:38

by Ryusuke Konishi

[permalink] [raw]
Subject: Re: [PATCH 19/23] nilfs2: Convert nilfs_lookup_dirty_node_buffers() to use filemap_get_folios_tag()

On Fri, Sep 2, 2022 at 7:07 AM Vishal Moola (Oracle) wrote:
>
> Convert function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> fs/nilfs2/segment.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)

Acked-by: Ryusuke Konishi <[email protected]>

Thanks,
Ryusuke Konishi

2022-09-03 17:39:58

by Ryusuke Konishi

[permalink] [raw]
Subject: Re: [PATCH 18/23] nilfs2: Convert nilfs_lookup_dirty_data_buffers() to use filemap_get_folios_tag()

On Fri, Sep 2, 2022 at 7:07 AM Vishal Moola (Oracle) wrote:
>
> Convert function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> fs/nilfs2/segment.c | 29 ++++++++++++++++-------------
> 1 file changed, 16 insertions(+), 13 deletions(-)
>
> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> index 0afe0832c754..e95c667bdc8f 100644
> --- a/fs/nilfs2/segment.c
> +++ b/fs/nilfs2/segment.c
> @@ -680,7 +680,7 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
> loff_t start, loff_t end)
> {
> struct address_space *mapping = inode->i_mapping;
> - struct pagevec pvec;
> + struct folio_batch fbatch;
> pgoff_t index = 0, last = ULONG_MAX;
> size_t ndirties = 0;
> int i;
> @@ -694,23 +694,26 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
> index = start >> PAGE_SHIFT;
> last = end >> PAGE_SHIFT;
> }
> - pagevec_init(&pvec);
> + folio_batch_init(&fbatch);
> repeat:
> if (unlikely(index > last) ||
> - !pagevec_lookup_range_tag(&pvec, mapping, &index, last,
> - PAGECACHE_TAG_DIRTY))
> + !filemap_get_folios_tag(mapping, &index, last,
> + PAGECACHE_TAG_DIRTY, &fbatch))
> return ndirties;
>
> - for (i = 0; i < pagevec_count(&pvec); i++) {
> + for (i = 0; i < folio_batch_count(&fbatch); i++) {
> struct buffer_head *bh, *head;
> - struct page *page = pvec.pages[i];
> + struct folio *folio = fbatch.folios[i];
>
> - lock_page(page);
> - if (!page_has_buffers(page))
> - create_empty_buffers(page, i_blocksize(inode), 0);
> - unlock_page(page);

> + head = folio_buffers(folio);
> + folio_lock(folio);

Could you please swap these two lines to keep the "head" check in the lock?

Thanks,
Ryusuke Konishi


> + if (!head) {
> + create_empty_buffers(&folio->page, i_blocksize(inode), 0);
> + head = folio_buffers(folio);
> + }
> + folio_unlock(folio);
>
> - bh = head = page_buffers(page);
> + bh = head;
> do {
> if (!buffer_dirty(bh) || buffer_async_write(bh))
> continue;
> @@ -718,13 +721,13 @@ static size_t nilfs_lookup_dirty_data_buffers(struct inode *inode,
> list_add_tail(&bh->b_assoc_buffers, listp);
> ndirties++;
> if (unlikely(ndirties >= nlimit)) {
> - pagevec_release(&pvec);
> + folio_batch_release(&fbatch);
> cond_resched();
> return ndirties;
> }
> } while (bh = bh->b_this_page, bh != head);
> }
> - pagevec_release(&pvec);
> + folio_batch_release(&fbatch);
> cond_resched();
> goto repeat;
> }
> --
> 2.36.1
>
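
For reference, the reordering requested above would keep the buffer
lookup and creation under the folio lock, roughly like this (a sketch
of the suggested fix, not a posted patch):

    folio_lock(folio);
    head = folio_buffers(folio);
    if (!head) {
        create_empty_buffers(&folio->page, i_blocksize(inode), 0);
        head = folio_buffers(folio);
    }
    folio_unlock(folio);

    bh = head;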

2022-09-03 17:40:11

by Ryusuke Konishi

[permalink] [raw]
Subject: Re: [PATCH 21/23] nilfs2: Convert nilfs_copy_dirty_pages() to use filemap_get_folios_tag()

On Fri, Sep 2, 2022 at 7:18 AM Vishal Moola (Oracle) wrote:
>
> Convert function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> fs/nilfs2/page.c | 39 ++++++++++++++++++++-------------------
> 1 file changed, 20 insertions(+), 19 deletions(-)

Acked-by: Ryusuke Konishi <[email protected]>

Thanks,
Ryusuke Konishi

2022-10-14 14:12:43

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 05/23] afs: Convert afs_writepages_region() to use filemap_get_folios_tag()

Vishal Moola (Oracle) <[email protected]> wrote:

> Convert to use folios throughout. This is in preparation for the
> removal of find_get_pages_range_tag().
>
> Also modified this function to write out the whole batch before
> looking up a new one, rather than fetching a new set for every single write.
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>

Tested-by: David Howells <[email protected]>

2022-10-18 21:08:17

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> Converted function to use folios throughout. This is in preparation for
> the removal of find_get_pages_range_tag().
>
> Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> ---
> mm/page-writeback.c | 44 +++++++++++++++++++++++---------------------
> 1 file changed, 23 insertions(+), 21 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 032a7bf8d259..087165357a5a 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2285,15 +2285,15 @@ int write_cache_pages(struct address_space *mapping,
> int ret = 0;
> int done = 0;
> int error;
> - struct pagevec pvec;
> - int nr_pages;
> + struct folio_batch fbatch;
> + int nr_folios;
> pgoff_t index;
> pgoff_t end; /* Inclusive */
> pgoff_t done_index;
> int range_whole = 0;
> xa_mark_t tag;
>
> - pagevec_init(&pvec);
> + folio_batch_init(&fbatch);
> if (wbc->range_cyclic) {
> index = mapping->writeback_index; /* prev offset */
> end = -1;
> @@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
> while (!done && (index <= end)) {
> int i;
>
> - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
> - tag);
> - if (nr_pages == 0)
> + nr_folios = filemap_get_folios_tag(mapping, &index, end,
> + tag, &fbatch);

This can find and return dirty multi-page folios if the filesystem
enables them in the mapping at instantiation time, right?

> +
> + if (nr_folios == 0)
> break;
>
> - for (i = 0; i < nr_pages; i++) {
> - struct page *page = pvec.pages[i];
> + for (i = 0; i < nr_folios; i++) {
> + struct folio *folio = fbatch.folios[i];
>
> - done_index = page->index;
> + done_index = folio->index;
>
> - lock_page(page);
> + folio_lock(folio);
>
> /*
> * Page truncated or invalidated. We can freely skip it
> @@ -2333,30 +2334,30 @@ int write_cache_pages(struct address_space *mapping,
> * even if there is now a new, dirty page at the same
> * pagecache address.
> */
> - if (unlikely(page->mapping != mapping)) {
> + if (unlikely(folio->mapping != mapping)) {
> continue_unlock:
> - unlock_page(page);
> + folio_unlock(folio);
> continue;
> }
>
> - if (!PageDirty(page)) {
> + if (!folio_test_dirty(folio)) {
> /* someone wrote it for us */
> goto continue_unlock;
> }
>
> - if (PageWriteback(page)) {
> + if (folio_test_writeback(folio)) {
> if (wbc->sync_mode != WB_SYNC_NONE)
> - wait_on_page_writeback(page);
> + folio_wait_writeback(folio);
> else
> goto continue_unlock;
> }
>
> - BUG_ON(PageWriteback(page));
> - if (!clear_page_dirty_for_io(page))
> + BUG_ON(folio_test_writeback(folio));
> + if (!folio_clear_dirty_for_io(folio))
> goto continue_unlock;
>
> trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> - error = (*writepage)(page, wbc, data);
> + error = writepage(&folio->page, wbc, data);

Yet, IIUC, this treats all folios as if they are single page folios.
i.e. it passes the head page of a multi-page folio to a callback
that will treat it as a single PAGE_SIZE page, because that's all
the writepage callbacks are currently expected to be passed...

So won't this break writeback of dirty multipage folios?

-Dave.
--
Dave Chinner
[email protected]

2022-10-18 21:50:09

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 00/23] Convert to filemap_get_folios_tag()

On Thu, Sep 01, 2022 at 03:01:15PM -0700, Vishal Moola (Oracle) wrote:
> This patch series replaces find_get_pages_range_tag() with
> filemap_get_folios_tag(). This also allows the removal of multiple
> calls to compound_head() throughout.
> It also makes a good chunk of the straightforward conversions to folios,
> and takes the opportunity to introduce a function that grabs a folio
> from the pagecache.
>
> F2fs and Ceph have quite alot of work to be done regarding folios, so
> for now those patches only have the changes necessary for the removal of
> find_get_pages_range_tag(), and only support folios of size 1 (which is
> all they use right now anyways).
>
> I've run xfstests on btrfs, ext4, f2fs, and nilfs2, but more testing may be
> beneficial.

Well, that answers my question about how filesystems that enable
multi-page folios were tested: they weren't.

I'd suggest that anyone working on further extending the
filemap/folio infrastructure really needs to be testing XFS as a
first priority, and then other filesystems as a secondary concern.

That's because XFS (via the fs/iomap infrastructure) is one of only
3 filesystems in the kernel (AFS and tmpfs are the others) that
interact with the page cache and page cache "pages" solely via folio
interfaces. As such they are able to support multi-page folios in
the page cache. All of the tested filesystems still use the fixed
PAGE_SIZE page interfaces to interact with the page cache, so they
don't actually exercise interactions with multi-page folios at all.

Hence if you are converting generic infrastructure that looks up
pages in the page cache to look up folios in the page cache, the
code that processes the returned folios also needs to be updated and
validated to ensure that it correctly handles multi-page folios. And
the only way you can do that fully at this point in time is via
testing XFS or AFS...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-11-03 22:02:56

by Vishal Moola

[permalink] [raw]
Subject: Re: [PATCH 00/23] Convert to filemap_get_folios_tag()

On Wed, Oct 19, 2022 at 08:45:44AM +1100, Dave Chinner wrote:
> On Thu, Sep 01, 2022 at 03:01:15PM -0700, Vishal Moola (Oracle) wrote:
> > This patch series replaces find_get_pages_range_tag() with
> > filemap_get_folios_tag(). This also allows the removal of multiple
> > calls to compound_head() throughout.
> > It also makes a good chunk of the straightforward conversions to folios,
> > and takes the opportunity to introduce a function that grabs a folio
> > from the pagecache.
> >
> > F2fs and Ceph have quite alot of work to be done regarding folios, so
> > for now those patches only have the changes necessary for the removal of
> > find_get_pages_range_tag(), and only support folios of size 1 (which is
> > all they use right now anyways).
> >
> > I've run xfstests on btrfs, ext4, f2fs, and nilfs2, but more testing may be
> > beneficial.
>
> Well, that answers my question about how filesystems that enable
> multi-page folios were tested: they weren't.
>
> I'd suggest that anyone working on further extending the
> filemap/folio infrastructure really needs to be testing XFS as a
> first priority, and then other filesystems as a secondary concern.
>
> That's because XFS (via the fs/iomap infrastructure) is one of only
> 3 filesystems in the kernel (AFS and tmpfs are the others) that
> interact with the page cache and page cache "pages" solely via folio
> interfaces. As such they are able to support multi-page folios in
> the page cache. All of the tested filesystems still use the fixed
> PAGE_SIZE page interfaces to interact with the page cache, so they
> don't actually exercise interactions with multi-page folios at all.
>

Thanks for the explanation! That makes perfect sense. I wholeheartedly
agree, and I'll be sure to test any future changes on XFS to try to
ensure multi-page folio functionality.

I know David ran tests on AFS, so hopefully those hit multipage folios
well enough. But I'm not sure whether it was just for the AFS patch or
with the whole series applied. Regardless, I'll run my own set of tests
on XFS and see if I run into any issues as well.

2022-11-03 22:32:29

by Vishal Moola

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Wed, Oct 19, 2022 at 08:01:52AM +1100, Dave Chinner wrote:
> On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> > Converted function to use folios throughout. This is in preparation for
> > the removal of find_get_pages_range_tag().
> >
> > Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> > ---
> > mm/page-writeback.c | 44 +++++++++++++++++++++++---------------------
> > 1 file changed, 23 insertions(+), 21 deletions(-)
> >
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index 032a7bf8d259..087165357a5a 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -2285,15 +2285,15 @@ int write_cache_pages(struct address_space *mapping,
> > int ret = 0;
> > int done = 0;
> > int error;
> > - struct pagevec pvec;
> > - int nr_pages;
> > + struct folio_batch fbatch;
> > + int nr_folios;
> > pgoff_t index;
> > pgoff_t end; /* Inclusive */
> > pgoff_t done_index;
> > int range_whole = 0;
> > xa_mark_t tag;
> >
> > - pagevec_init(&pvec);
> > + folio_batch_init(&fbatch);
> > if (wbc->range_cyclic) {
> > index = mapping->writeback_index; /* prev offset */
> > end = -1;
> > @@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
> > while (!done && (index <= end)) {
> > int i;
> >
> > - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
> > - tag);
> > - if (nr_pages == 0)
> > + nr_folios = filemap_get_folios_tag(mapping, &index, end,
> > + tag, &fbatch);
>
> This can find and return dirty multi-page folios if the filesystem
> enables them in the mapping at instantiation time, right?

Yup, it will.

> > +
> > + if (nr_folios == 0)
> > break;
> >
> > - for (i = 0; i < nr_pages; i++) {
> > - struct page *page = pvec.pages[i];
> > + for (i = 0; i < nr_folios; i++) {
> > + struct folio *folio = fbatch.folios[i];
> >
> > - done_index = page->index;
> > + done_index = folio->index;
> >
> > - lock_page(page);
> > + folio_lock(folio);
> >
> > /*
> > * Page truncated or invalidated. We can freely skip it
> > @@ -2333,30 +2334,30 @@ int write_cache_pages(struct address_space *mapping,
> > * even if there is now a new, dirty page at the same
> > * pagecache address.
> > */
> > - if (unlikely(page->mapping != mapping)) {
> > + if (unlikely(folio->mapping != mapping)) {
> > continue_unlock:
> > - unlock_page(page);
> > + folio_unlock(folio);
> > continue;
> > }
> >
> > - if (!PageDirty(page)) {
> > + if (!folio_test_dirty(folio)) {
> > /* someone wrote it for us */
> > goto continue_unlock;
> > }
> >
> > - if (PageWriteback(page)) {
> > + if (folio_test_writeback(folio)) {
> > if (wbc->sync_mode != WB_SYNC_NONE)
> > - wait_on_page_writeback(page);
> > + folio_wait_writeback(folio);
> > else
> > goto continue_unlock;
> > }
> >
> > - BUG_ON(PageWriteback(page));
> > - if (!clear_page_dirty_for_io(page))
> > + BUG_ON(folio_test_writeback(folio));
> > + if (!folio_clear_dirty_for_io(folio))
> > goto continue_unlock;
> >
> > trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> > - error = (*writepage)(page, wbc, data);
> > + error = writepage(&folio->page, wbc, data);
>
> Yet, IIUC, this treats all folios as if they are single page folios.
> i.e. it passes the head page of a multi-page folio to a callback
> that will treat it as a single PAGE_SIZE page, because that's all
> the writepage callbacks are currently expected to be passed...
>
> So won't this break writeback of dirty multipage folios?

Yes, it appears it would. But it wouldn't, because it's already 'broken'.

The current find_get_pages_range_tag() actually has the exact same
issue. The current code to fill up the pages array is:

    pages[ret] = &folio->page;
    if (++ret == nr_pages) {
            *index = folio->index + folio_nr_pages(folio);
            goto out;

which breaks large folios in exactly the way you pointed out. When I
spoke to Matthew about this earlier, we decided to go ahead with
replacing the function and leave it up to the callers to fix/handle
large folios when each filesystem gets to it.

It's not great to leave it 'broken', but it's something that isn't - or at
least shouldn't be - creating any problems at present. And I believe Matthew
has plans to address them at some point before they actually become problems?

2022-11-04 00:37:56

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Thu, Nov 03, 2022 at 03:28:05PM -0700, Vishal Moola wrote:
> On Wed, Oct 19, 2022 at 08:01:52AM +1100, Dave Chinner wrote:
> > On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> > > Converted function to use folios throughout. This is in preparation for
> > > the removal of find_get_pages_range_tag().
> > >
> > > Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> > > ---
> > > mm/page-writeback.c | 44 +++++++++++++++++++++++---------------------
> > > 1 file changed, 23 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > index 032a7bf8d259..087165357a5a 100644
> > > --- a/mm/page-writeback.c
> > > +++ b/mm/page-writeback.c
> > > @@ -2285,15 +2285,15 @@ int write_cache_pages(struct address_space *mapping,
> > > int ret = 0;
> > > int done = 0;
> > > int error;
> > > - struct pagevec pvec;
> > > - int nr_pages;
> > > + struct folio_batch fbatch;
> > > + int nr_folios;
> > > pgoff_t index;
> > > pgoff_t end; /* Inclusive */
> > > pgoff_t done_index;
> > > int range_whole = 0;
> > > xa_mark_t tag;
> > >
> > > - pagevec_init(&pvec);
> > > + folio_batch_init(&fbatch);
> > > if (wbc->range_cyclic) {
> > > index = mapping->writeback_index; /* prev offset */
> > > end = -1;
> > > @@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
> > > while (!done && (index <= end)) {
> > > int i;
> > >
> > > - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
> > > - tag);
> > > - if (nr_pages == 0)
> > > + nr_folios = filemap_get_folios_tag(mapping, &index, end,
> > > + tag, &fbatch);
> >
> > This can find and return dirty multi-page folios if the filesystem
> > enables them in the mapping at instantiation time, right?
>
> Yup, it will.
>
> > > +
> > > + if (nr_folios == 0)
> > > break;
> > >
> > > - for (i = 0; i < nr_pages; i++) {
> > > - struct page *page = pvec.pages[i];
> > > + for (i = 0; i < nr_folios; i++) {
> > > + struct folio *folio = fbatch.folios[i];
> > >
> > > - done_index = page->index;
> > > + done_index = folio->index;
> > >
> > > - lock_page(page);
> > > + folio_lock(folio);
> > >
> > > /*
> > > * Page truncated or invalidated. We can freely skip it
> > > @@ -2333,30 +2334,30 @@ int write_cache_pages(struct address_space *mapping,
> > > * even if there is now a new, dirty page at the same
> > > * pagecache address.
> > > */
> > > - if (unlikely(page->mapping != mapping)) {
> > > + if (unlikely(folio->mapping != mapping)) {
> > > continue_unlock:
> > > - unlock_page(page);
> > > + folio_unlock(folio);
> > > continue;
> > > }
> > >
> > > - if (!PageDirty(page)) {
> > > + if (!folio_test_dirty(folio)) {
> > > /* someone wrote it for us */
> > > goto continue_unlock;
> > > }
> > >
> > > - if (PageWriteback(page)) {
> > > + if (folio_test_writeback(folio)) {
> > > if (wbc->sync_mode != WB_SYNC_NONE)
> > > - wait_on_page_writeback(page);
> > > + folio_wait_writeback(folio);
> > > else
> > > goto continue_unlock;
> > > }
> > >
> > > - BUG_ON(PageWriteback(page));
> > > - if (!clear_page_dirty_for_io(page))
> > > + BUG_ON(folio_test_writeback(folio));
> > > + if (!folio_clear_dirty_for_io(folio))
> > > goto continue_unlock;
> > >
> > > trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> > > - error = (*writepage)(page, wbc, data);
> > > + error = writepage(&folio->page, wbc, data);
> >
> > Yet, IIUC, this treats all folios as if they are single page folios.
> > i.e. it passes the head page of a multi-page folio to a callback
> > that will treat it as a single PAGE_SIZE page, because that's all
> > the writepage callbacks are currently expected to be passed...
> >
> > So won't this break writeback of dirty multipage folios?
>
> Yes, it appears it would. But it wouldn't, because it's already 'broken'.

It is? Then why isn't XFS broken on existing kernels? Oh, we don't
know because it hasn't been tested?

Seriously - if this really is broken, and this patchset is further
propagating the brokenness, then somebody needs to explain to me why
this is not corrupting data in XFS.

I get it that page/folios are in transition, but passing a
multi-page folio page to an interface that expects a PAGE_SIZE
struct page is a pretty nasty landmine, regardless of how broken the
higher level iteration code already might be.

At minimum, it needs to be documented, though I'd much prefer that
we explicitly duplicate write_cache_pages() as write_cache_folios()
with a callback that takes a folio and change the code to be fully
multi-page folio safe. Then filesystems that support folios (and
large folios) natively can be passed folios without going through
this crappy "folio->page, page->folio" dance because the writepage
APIs are unaware of multi-page folio constructs.

Then you can convert the individual filesystems using
write_cache_pages() to call write_cache_folios() one at a time,
updating the filesystem callback to do the conversion from folio to
struct page and checking that it an order-0 page that it has been
handed....
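
A rough sketch of what such an interface could look like (hypothetical
names and signatures - write_cache_folios() does not exist at this
point in the thread; range handling, tagging and error-policy details
are elided):

    typedef int (*writepage_folio_t)(struct folio *folio,
                                     struct writeback_control *wbc,
                                     void *data);

    int write_cache_folios(struct address_space *mapping,
                           struct writeback_control *wbc,
                           writepage_folio_t writepage, void *data)
    {
        struct folio_batch fbatch;
        pgoff_t index = 0;
        unsigned int i;
        int error = 0;

        folio_batch_init(&fbatch);
        while (filemap_get_folios_tag(mapping, &index, (pgoff_t)-1,
                                      PAGECACHE_TAG_DIRTY, &fbatch)) {
            for (i = 0; i < folio_batch_count(&fbatch); i++) {
                struct folio *folio = fbatch.folios[i];

                folio_lock(folio);
                if (folio->mapping != mapping ||
                    !folio_clear_dirty_for_io(folio)) {
                    folio_unlock(folio);
                    continue;
                }
                /*
                 * The callback gets the whole folio and, like
                 * ->writepage, is responsible for unlocking it.
                 * A multi-page folio is written back in full
                 * instead of being treated as a single
                 * PAGE_SIZE head page.
                 */
                error = writepage(folio, wbc, data);
                if (error)
                    break;
            }
            folio_batch_release(&fbatch);
            cond_resched();
            if (error)
                break;
        }
        return error;
    }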

> The current find_get_pages_range_tag() actually has the exact same
> issue. The current code to fill up the pages array is:
>
> pages[ret] = &folio->page;
> if (++ret == nr_pages) {
> *index = folio->index + folio_nr_pages(folio);
> goto out;

"It's already broken so we can make it more broken" isn't an
acceptable answer....

> Its not great to leave it 'broken' but its something that isn't - or at
> least shouldn't be - creating any problems at present. And I believe Matthew
> has plans to address them at some point before they actually become problems?

You are modifying the interfaces and doing folio conversions that
expose and propagate the brokenness. The brokenness needs to be
either avoided or fixed and not propagated further. Doing the above
write_cache_folios() conversion avoids propagating the
brokenness, adds runtime detection of brokenness, and provides the
right interface for writeback iteration of folios.

Fixing the generic writeback iterator properly is not much extra
work, and it sets the model for filesytsems that have copy-pasted
write_cache_pages() and then hacked it around for their own purposes
(e.g. ext4, btrfs) to follow.

-Dave.
--
Dave Chinner
[email protected]

2022-11-04 02:49:10

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Fri, Nov 04, 2022 at 11:32:35AM +1100, Dave Chinner wrote:
> On Thu, Nov 03, 2022 at 03:28:05PM -0700, Vishal Moola wrote:
> > On Wed, Oct 19, 2022 at 08:01:52AM +1100, Dave Chinner wrote:
> > > On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> > > > Converted function to use folios throughout. This is in preparation for
> > > > the removal of find_get_pages_range_tag().
> > > >
> > > > Signed-off-by: Vishal Moola (Oracle) <[email protected]>
> > > > ---
> > > > mm/page-writeback.c | 44 +++++++++++++++++++++++---------------------
> > > > 1 file changed, 23 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > > > index 032a7bf8d259..087165357a5a 100644
> > > > --- a/mm/page-writeback.c
> > > > +++ b/mm/page-writeback.c
> > > > @@ -2285,15 +2285,15 @@ int write_cache_pages(struct address_space *mapping,
> > > > int ret = 0;
> > > > int done = 0;
> > > > int error;
> > > > - struct pagevec pvec;
> > > > - int nr_pages;
> > > > + struct folio_batch fbatch;
> > > > + int nr_folios;
> > > > pgoff_t index;
> > > > pgoff_t end; /* Inclusive */
> > > > pgoff_t done_index;
> > > > int range_whole = 0;
> > > > xa_mark_t tag;
> > > >
> > > > - pagevec_init(&pvec);
> > > > + folio_batch_init(&fbatch);
> > > > if (wbc->range_cyclic) {
> > > > index = mapping->writeback_index; /* prev offset */
> > > > end = -1;
> > > > @@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
> > > > while (!done && (index <= end)) {
> > > > int i;
> > > >
> > > > - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
> > > > - tag);
> > > > - if (nr_pages == 0)
> > > > + nr_folios = filemap_get_folios_tag(mapping, &index, end,
> > > > + tag, &fbatch);
> > >
> > > This can find and return dirty multi-page folios if the filesystem
> > > enables them in the mapping at instantiation time, right?
> >
> > Yup, it will.
> >
> > > > +
> > > > + if (nr_folios == 0)
> > > > break;
> > > >
> > > > - for (i = 0; i < nr_pages; i++) {
> > > > - struct page *page = pvec.pages[i];
> > > > + for (i = 0; i < nr_folios; i++) {
> > > > + struct folio *folio = fbatch.folios[i];
> > > >
> > > > - done_index = page->index;
> > > > + done_index = folio->index;
> > > >
> > > > - lock_page(page);
> > > > + folio_lock(folio);
> > > >
> > > > /*
> > > > * Page truncated or invalidated. We can freely skip it
> > > > @@ -2333,30 +2334,30 @@ int write_cache_pages(struct address_space *mapping,
> > > > * even if there is now a new, dirty page at the same
> > > > * pagecache address.
> > > > */
> > > > - if (unlikely(page->mapping != mapping)) {
> > > > + if (unlikely(folio->mapping != mapping)) {
> > > > continue_unlock:
> > > > - unlock_page(page);
> > > > + folio_unlock(folio);
> > > > continue;
> > > > }
> > > >
> > > > - if (!PageDirty(page)) {
> > > > + if (!folio_test_dirty(folio)) {
> > > > /* someone wrote it for us */
> > > > goto continue_unlock;
> > > > }
> > > >
> > > > - if (PageWriteback(page)) {
> > > > + if (folio_test_writeback(folio)) {
> > > > if (wbc->sync_mode != WB_SYNC_NONE)
> > > > - wait_on_page_writeback(page);
> > > > + folio_wait_writeback(folio);
> > > > else
> > > > goto continue_unlock;
> > > > }
> > > >
> > > > - BUG_ON(PageWriteback(page));
> > > > - if (!clear_page_dirty_for_io(page))
> > > > + BUG_ON(folio_test_writeback(folio));
> > > > + if (!folio_clear_dirty_for_io(folio))
> > > > goto continue_unlock;
> > > >
> > > > trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> > > > - error = (*writepage)(page, wbc, data);
> > > > + error = writepage(&folio->page, wbc, data);
> > >
> > > Yet, IIUC, this treats all folios as if they are single page folios.
> > > i.e. it passes the head page of a multi-page folio to a callback
> > > that will treat it as a single PAGE_SIZE page, because that's all
> > > the writepage callbacks are currently expected to be passed...
> > >
> > > So won't this break writeback of dirty multipage folios?
> >
> > Yes, it appears it would. But it wouldn't, because it's already 'broken'.
>
> It is? Then why isn't XFS broken on existing kernels? Oh, we don't
> know because it hasn't been tested?
>
> Seriously - if this really is broken, and this patchset is further
> propagating the brokenness, then somebody needs to explain to me why
> this is not corrupting data in XFS.

It looks like iomap_do_writepage finds the folio size correctly

end_pos = folio_pos(folio) + folio_size(folio);

and iomap_writpage_map will map out the correct number of blocks

unsigned nblocks = i_blocks_per_folio(inode, folio);

for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {

right? The interface is dangerous because anyone who enables multipage
folios has to be aware that ->writepage can be handed a multipage folio.

(That said, the lack of mention of xfs in the testing plan doesn't give
me much confidence anyone has checked this...)

> I get it that page/folios are in transition, but passing a
> multi-page folio page to an interface that expects a PAGE_SIZE
> struct page is a pretty nasty landmine, regardless of how broken the
> higher level iteration code already might be.
>
> At minimum, it needs to be documented, though I'd much prefer that
> we explicitly duplicate write_cache_pages() as write_cache_folios()
> with a callback that takes a folio and change the code to be fully
> multi-page folio safe. Then filesystems that support folios (and
> large folios) natively can be passed folios without going through
> this crappy "folio->page, page->folio" dance because the writepage
> APIs are unaware of multi-page folio constructs.

Agree. Build the new one, move callers over, and kill the old one.

> Then you can convert the individual filesystems using
> write_cache_pages() to call write_cache_folios() one at a time,
> updating the filesystem callback to do the conversion from folio to
> struct page and checking that it an order-0 page that it has been
> handed....
>
> > The current find_get_pages_range_tag() actually has the exact same
> > issue. The current code to fill up the pages array is:
> >
> > pages[ret] = &folio->page;
> > if (++ret == nr_pages) {
> > *index = folio->index + folio_nr_pages(folio);
> > goto out;
>
> "It's already broken so we can make it more broken" isn't an
> acceptable answer....
>
> > Its not great to leave it 'broken' but its something that isn't - or at
> > least shouldn't be - creating any problems at present. And I believe Matthew
> > has plans to address them at some point before they actually become problems?
>
> You are modifying the interfaces and doing folio conversions that
> expose and propagate the brokenness. The brokenness needs to be
> either avoided or fixed and not propagated further. Doing the above
> write_cache_folios() conversion avoids propagating the
> brokenness, adds runtime detection of brokenness, and provides the
> right interface for writeback iteration of folios.
>
> Fixing the generic writeback iterator properly is not much extra
> work, and it sets the model for filesystems that have copy-pasted
> write_cache_pages() and then hacked it around for their own purposes
> (e.g. ext4, btrfs) to follow.
>
> -Dave.
> --
> Dave Chinner
> [email protected]

2022-11-04 03:39:36

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Thu, Nov 03, 2022 at 07:45:01PM -0700, Darrick J. Wong wrote:
> On Fri, Nov 04, 2022 at 11:32:35AM +1100, Dave Chinner wrote:
> > On Thu, Nov 03, 2022 at 03:28:05PM -0700, Vishal Moola wrote:
> > > On Wed, Oct 19, 2022 at 08:01:52AM +1100, Dave Chinner wrote:
> > > > On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> > > > > - BUG_ON(PageWriteback(page));
> > > > > - if (!clear_page_dirty_for_io(page))
> > > > > + BUG_ON(folio_test_writeback(folio));
> > > > > + if (!folio_clear_dirty_for_io(folio))
> > > > > goto continue_unlock;
> > > > >
> > > > > trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> > > > > - error = (*writepage)(page, wbc, data);
> > > > > + error = writepage(&folio->page, wbc, data);
> > > >
> > > > Yet, IIUC, this treats all folios as if they are single page folios.
> > > > i.e. it passes the head page of a multi-page folio to a callback
> > > > that will treat it as a single PAGE_SIZE page, because that's all
> > > > the writepage callbacks are currently expected to be passed...
> > > >
> > > > So won't this break writeback of dirty multipage folios?
> > >
> > > Yes, it appears it would. But it wouldn't, because it's already 'broken'.
> >
> > It is? Then why isn't XFS broken on existing kernels? Oh, we don't
> > know because it hasn't been tested?
> >
> > Seriously - if this really is broken, and this patchset is further
> > propagating the brokenness, then somebody needs to explain to me why
> > this is not corrupting data in XFS.
>
> It looks like iomap_do_writepage finds the folio size correctly
>
> end_pos = folio_pos(folio) + folio_size(folio);
>
> and iomap_writepage_map will map out the correct number of blocks
>
> unsigned nblocks = i_blocks_per_folio(inode, folio);
>
> for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
>
> right?

Yup, that's how I read it, too.

But my recent experience with folios involved being repeatedly
burnt by edge case corruptions due to multipage folios showing up
when and where I least expected them.

Hence doing a 1:1 conversion of page-based code to folio-based code
and just assuming large folios will work without any testing seems
akin to playing Russian roulette with loose cannons that have been
doused with napalm and then set on fire by an air-dropped barrel
bomb...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2022-11-04 15:30:27

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Wed, Oct 19, 2022 at 08:01:52AM +1100, Dave Chinner wrote:
> On Thu, Sep 01, 2022 at 03:01:19PM -0700, Vishal Moola (Oracle) wrote:
> > @@ -2313,17 +2313,18 @@ int write_cache_pages(struct address_space *mapping,
> > while (!done && (index <= end)) {
> > int i;
> >
> > - nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end,
> > - tag);
> > - if (nr_pages == 0)
> > + nr_folios = filemap_get_folios_tag(mapping, &index, end,
> > + tag, &fbatch);
>
> This can find and return dirty multi-page folios if the filesystem
> enables them in the mapping at instantiation time, right?

Correct. Just like before the patch. pagevec_lookup_range_tag() has
only ever returned head pages, never tail pages. This is probably
because shmem (which was our only fs that supported compound pages)
never supported writeback, so it never looked up pages by tag.

> > trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> > - error = (*writepage)(page, wbc, data);
> > + error = writepage(&folio->page, wbc, data);
>
> Yet, IIUC, this treats all folios as if they are single page folios.
> i.e. it passes the head page of a multi-page folio to a callback
> that will treat it as a single PAGE_SIZE page, because that's all
> the writepage callbacks are currently expected to be passed...
>
> So won't this break writeback of dirty multipage folios?

No. A filesystem only sets the flag to create multipage folios once its
writeback callback handles multipage folios correctly (amongst many other
things that have to be fixed and tested). I haven't written down all
the things that a filesystem maintainer needs to check, at least partly
because I don't know how representative XFS/iomap are of all filesystems.
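
For context, that opt-in is per-mapping. A filesystem which has done
the conversion work enables large folios roughly like this (a sketch;
XFS makes this call from its inode setup path):

	/* Done once, when the inode's address_space is set up: */
	mapping_set_large_folios(inode->i_mapping);

Until a filesystem makes that call, the page cache only ever creates
order-0 folios for its mappings, which is why the unconverted callers
of write_cache_pages() remain safe.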


2022-11-04 20:16:14

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 04/23] page-writeback: Convert write_cache_pages() to use filemap_get_folios_tag()

On Fri, Nov 04, 2022 at 11:32:35AM +1100, Dave Chinner wrote:
> At minimum, it needs to be documented, though I'd much prefer that
> we explicitly duplicate write_cache_pages() as write_cache_folios()
> with a callback that takes a folio and change the code to be fully
> multi-page folio safe. Then filesystems that support folios (and
> large folios) natively can be passed folios without going through
> this crappy "folio->page, page->folio" dance because the writepage
> APIs are unaware of multi-page folio constructs.

There are a lot of places which go through the folio->page->folio
dance, and this one wasn't even close to the top of my list. That
said, it has a fairly small number of callers -- ext4, fuse, iomap,
mpage, nfs, orangefs. So Vishal, this seems like a good project for you
to take on next -- convert write_cache_pages() to write_cache_folios()
and writepage_t to write_folio_t.
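
For reference, the existing callback type in
include/linux/writeback.h is:

	typedef int (*writepage_t)(struct page *page,
			struct writeback_control *wbc, void *data);

so the proposed conversion amounts to something like this sketch
(write_folio_t and write_cache_folios() are the proposed names, not
an existing API):

	typedef int (*write_folio_t)(struct folio *folio,
			struct writeback_control *wbc, void *data);

	int write_cache_folios(struct address_space *mapping,
			struct writeback_control *wbc,
			write_folio_t write_folio, void *data);

with each converted filesystem's callback taking the folio directly
instead of being handed &folio->page.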