2023-01-26 20:24:57

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 00/31] Convert most of ext4 to folios

This, on top of a number of patches currently in next and a few patches
sent to the mailing lists earlier today, converts most of ext4 to use
folios instead of pages. It does not add support for large folios.
It does not convert mballoc to use folios. write_begin() and write_end()
still take a page parameter instead of a folio.

It does convert a lot of code away from the page APIs that we're trying
to remove. It does remove a lot of calls to compound_head(). I'd like
to see it land in 6.4, so getting it in for review early.

Matthew Wilcox (Oracle) (31):
fs: Add FGP_WRITEBEGIN
fscrypt: Add some folio helper functions
ext4: Convert ext4_bio_write_page() to use a folio
ext4: Convert ext4_finish_bio() to use folios
ext4: Convert ext4_writepage() to use a folio
ext4: Turn mpage_process_page() into mpage_process_folio()
ext4: Convert mpage_submit_page() to mpage_submit_folio()
ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()
ext4: Convert ext4_readpage_inline() to take a folio
ext4: Convert ext4_convert_inline_data_to_extent() to use a folio
ext4: Convert ext4_try_to_write_inline_data() to use a folio
ext4: Convert ext4_da_convert_inline_data_to_extent() to use a folio
ext4: Convert ext4_da_write_inline_data_begin() to use a folio
ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio()
ext4: Convert ext4_write_inline_data_end() to use a folio
ext4: Convert ext4_write_begin() to use a folio
ext4: Convert ext4_write_end() to use a folio
ext4: Use a folio in ext4_journalled_write_end()
ext4: Convert ext4_journalled_zero_new_buffers() to use a folio
ext4: Convert __ext4_block_zero_page_range() to use a folio
ext4: Convert __ext4_journalled_writepage() to take a folio
ext4: Convert ext4_page_nomap_can_writeout() to take a folio
ext4: Use a folio in ext4_da_write_begin()
ext4: Convert ext4_mpage_readpages() to work on folios
ext4: Convert ext4_block_write_begin() to take a folio
ext4: Convert ext4_writepage() to take a folio
ext4: Use a folio in ext4_page_mkwrite()
ext4: Use a folio iterator in __read_end_io()
ext4: Convert mext_page_mkuptodate() to take a folio
ext4: Convert pagecache_read() to use a folio
ext4: Use a folio in ext4_read_merkle_tree_page

fs/ext4/ext4.h | 9 +-
fs/ext4/inline.c | 171 ++++++++--------
fs/ext4/inode.c | 394 +++++++++++++++++++------------------
fs/ext4/move_extent.c | 33 ++--
fs/ext4/page-io.c | 98 +++++----
fs/ext4/readpage.c | 72 ++++---
fs/ext4/verity.c | 30 ++-
fs/iomap/buffered-io.c | 2 +-
fs/netfs/buffered_read.c | 3 +-
include/linux/fscrypt.h | 21 ++
include/linux/page-flags.h | 5 -
include/linux/pagemap.h | 2 +
mm/folio-compat.c | 4 +-
13 files changed, 424 insertions(+), 420 deletions(-)

--
2.35.1



2023-01-26 20:25:13

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio

Remove several calls to compound_head() and the last caller of
set_page_writeback_keepwrite(), so remove the wrapper too.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/page-io.c | 58 ++++++++++++++++++--------------------
include/linux/page-flags.h | 5 ----
2 files changed, 27 insertions(+), 36 deletions(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index beaec6d81074..982791050892 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -409,11 +409,9 @@ static void io_submit_init_bio(struct ext4_io_submit *io,

static void io_submit_add_bh(struct ext4_io_submit *io,
struct inode *inode,
- struct page *page,
+ struct folio *folio,
struct buffer_head *bh)
{
- int ret;
-
if (io->io_bio && (bh->b_blocknr != io->io_next_block ||
!fscrypt_mergeable_bio_bh(io->io_bio, bh))) {
submit_and_retry:
@@ -421,10 +419,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
}
if (io->io_bio == NULL)
io_submit_init_bio(io, bh);
- ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh));
- if (ret != bh->b_size)
+ if (!bio_add_folio(io->io_bio, folio, bh->b_size, bh_offset(bh)))
goto submit_and_retry;
- wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size);
+ wbc_account_cgroup_owner(io->io_wbc, &folio->page, bh->b_size);
io->io_next_block++;
}

@@ -432,8 +429,9 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
struct page *page,
int len)
{
- struct page *bounce_page = NULL;
- struct inode *inode = page->mapping->host;
+ struct folio *folio = page_folio(page);
+ struct folio *io_folio = folio;
+ struct inode *inode = folio->mapping->host;
unsigned block_start;
struct buffer_head *bh, *head;
int ret = 0;
@@ -441,30 +439,30 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
struct writeback_control *wbc = io->io_wbc;
bool keep_towrite = false;

- BUG_ON(!PageLocked(page));
- BUG_ON(PageWriteback(page));
+ BUG_ON(!folio_test_locked(folio));
+ BUG_ON(folio_test_writeback(folio));

- ClearPageError(page);
+ folio_clear_error(folio);

/*
* Comments copied from block_write_full_page:
*
- * The page straddles i_size. It must be zeroed out on each and every
+ * The folio straddles i_size. It must be zeroed out on each and every
* writepage invocation because it may be mmapped. "A file is mapped
* in multiples of the page size. For a file that is not a multiple of
* the page size, the remaining memory is zeroed when mapped, and
* writes to that region are not written out to the file."
*/
- if (len < PAGE_SIZE)
- zero_user_segment(page, len, PAGE_SIZE);
+ if (len < folio_size(folio))
+ folio_zero_segment(folio, len, folio_size(folio));
/*
* In the first loop we prepare and mark buffers to submit. We have to
- * mark all buffers in the page before submitting so that
- * end_page_writeback() cannot be called from ext4_end_bio() when IO
+ * mark all buffers in the folio before submitting so that
+ * folio_end_writeback() cannot be called from ext4_end_bio() when IO
* on the first buffer finishes and we are still working on submitting
* the second buffer.
*/
- bh = head = page_buffers(page);
+ bh = head = folio_buffers(folio);
do {
block_start = bh_offset(bh);
if (block_start >= len) {
@@ -479,14 +477,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
clear_buffer_dirty(bh);
/*
* Keeping dirty some buffer we cannot write? Make sure
- * to redirty the page and keep TOWRITE tag so that
- * racing WB_SYNC_ALL writeback does not skip the page.
+ * to redirty the folio and keep TOWRITE tag so that
+ * racing WB_SYNC_ALL writeback does not skip the folio.
* This happens e.g. when doing writeout for
* transaction commit.
*/
if (buffer_dirty(bh)) {
- if (!PageDirty(page))
- redirty_page_for_writepage(wbc, page);
+ if (!folio_test_dirty(folio))
+ folio_redirty_for_writepage(wbc, folio);
keep_towrite = true;
}
continue;
@@ -498,11 +496,11 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
nr_to_submit++;
} while ((bh = bh->b_this_page) != head);

- /* Nothing to submit? Just unlock the page... */
+ /* Nothing to submit? Just unlock the folio... */
if (!nr_to_submit)
goto unlock;

- bh = head = page_buffers(page);
+ bh = head = folio_buffers(folio);

/*
* If any blocks are being written to an encrypted file, encrypt them
@@ -514,6 +512,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
if (fscrypt_inode_uses_fs_layer_crypto(inode) && nr_to_submit) {
gfp_t gfp_flags = GFP_NOFS;
unsigned int enc_bytes = round_up(len, i_blocksize(inode));
+ struct page *bounce_page;

/*
* Since bounce page allocation uses a mempool, we can only use
@@ -540,7 +539,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
}

printk_ratelimited(KERN_ERR "%s: ret = %d\n", __func__, ret);
- redirty_page_for_writepage(wbc, page);
+ folio_redirty_for_writepage(wbc, folio);
do {
if (buffer_async_write(bh)) {
clear_buffer_async_write(bh);
@@ -550,21 +549,18 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
} while (bh != head);
goto unlock;
}
+ io_folio = page_folio(bounce_page);
}

- if (keep_towrite)
- set_page_writeback_keepwrite(page);
- else
- set_page_writeback(page);
+ __folio_start_writeback(folio, keep_towrite);

/* Now submit buffers to write */
do {
if (!buffer_async_write(bh))
continue;
- io_submit_add_bh(io, inode,
- bounce_page ? bounce_page : page, bh);
+ io_submit_add_bh(io, inode, io_folio, bh);
} while ((bh = bh->b_this_page) != head);
unlock:
- unlock_page(page);
+ folio_unlock(folio);
return ret;
}
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 0425f22a9c82..bba2a32031a2 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -766,11 +766,6 @@ bool set_page_writeback(struct page *page);
#define folio_start_writeback_keepwrite(folio) \
__folio_start_writeback(folio, true)

-static inline void set_page_writeback_keepwrite(struct page *page)
-{
- folio_start_writeback_keepwrite(page_folio(page));
-}
-
static inline bool test_set_page_writeback(struct page *page)
{
return set_page_writeback(page);
--
2.35.1


2023-01-26 20:25:17

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio

Prepare for multi-page folios and save some instructions by converting
to the folio API.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/inode.c | 29 ++++++++++++++---------------
1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b8b3e2e0d9fd..8e3d2cca1e0c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2027,26 +2027,25 @@ static int ext4_writepage(struct page *page,

trace_ext4_writepage(page);
size = i_size_read(inode);
- if (page->index == size >> PAGE_SHIFT &&
+ len = folio_size(folio);
+ if (folio_pos(folio) + len > size &&
!ext4_verity_in_progress(inode))
- len = size & ~PAGE_MASK;
- else
- len = PAGE_SIZE;
+ len = size - folio_pos(folio);

+ page_bufs = folio_buffers(folio);
/* Should never happen but for bugs in other kernel subsystems */
- if (!page_has_buffers(page)) {
+ if (!page_bufs) {
ext4_warning_inode(inode,
- "page %lu does not have buffers attached", page->index);
- ClearPageDirty(page);
- unlock_page(page);
+ "page %lu does not have buffers attached", folio->index);
+ folio_clear_dirty(folio);
+ folio_unlock(folio);
return 0;
}

- page_bufs = page_buffers(page);
/*
* We cannot do block allocation or other extent handling in this
* function. If there are buffers needing that, we have to redirty
- * the page. But we may reach here when we do a journal commit via
+ * the folio. But we may reach here when we do a journal commit via
* journal_submit_inode_data_buffers() and in that case we must write
* allocated buffers to achieve data=ordered mode guarantees.
*
@@ -2062,7 +2061,7 @@ static int ext4_writepage(struct page *page,
*/
if (ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len, NULL,
ext4_bh_delay_or_unwritten)) {
- redirty_page_for_writepage(wbc, page);
+ folio_redirty_for_writepage(wbc, folio);
if ((current->flags & PF_MEMALLOC) ||
(inode->i_sb->s_blocksize == PAGE_SIZE)) {
/*
@@ -2072,12 +2071,12 @@ static int ext4_writepage(struct page *page,
*/
WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD))
== PF_MEMALLOC);
- unlock_page(page);
+ folio_unlock(folio);
return 0;
}
}

- if (PageChecked(page) && ext4_should_journal_data(inode))
+ if (folio_test_checked(folio) && ext4_should_journal_data(inode))
/*
* It's mmapped pagecache. Add buffers and journal it. There
* doesn't seem much point in redirtying the page here.
@@ -2087,8 +2086,8 @@ static int ext4_writepage(struct page *page,
ext4_io_submit_init(&io_submit, wbc);
io_submit.io_end = ext4_init_io_end(inode, GFP_NOFS);
if (!io_submit.io_end) {
- redirty_page_for_writepage(wbc, page);
- unlock_page(page);
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
return -ENOMEM;
}
ret = ext4_bio_write_page(&io_submit, page, len);
--
2.35.1


2023-01-26 20:25:22

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

Prepare ext4 to support large folios in the page writeback path.
Also set the actual error in the mapping, not just -EIO.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/page-io.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 982791050892..fd6c0dca24b9 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -99,30 +99,30 @@ static void buffer_io_error(struct buffer_head *bh)

static void ext4_finish_bio(struct bio *bio)
{
- struct bio_vec *bvec;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;

- bio_for_each_segment_all(bvec, bio, iter_all) {
- struct page *page = bvec->bv_page;
- struct page *bounce_page = NULL;
+ bio_for_each_folio_all(fi, bio) {
+ struct folio *folio = fi.folio;
+ struct folio *io_folio = NULL;
struct buffer_head *bh, *head;
- unsigned bio_start = bvec->bv_offset;
- unsigned bio_end = bio_start + bvec->bv_len;
+ size_t bio_start = fi.offset;
+ size_t bio_end = bio_start + fi.length;
unsigned under_io = 0;
unsigned long flags;

- if (fscrypt_is_bounce_page(page)) {
- bounce_page = page;
- page = fscrypt_pagecache_page(bounce_page);
+ if (fscrypt_is_bounce_folio(folio)) {
+ io_folio = folio;
+ folio = fscrypt_pagecache_folio(folio);
}

if (bio->bi_status) {
- SetPageError(page);
- mapping_set_error(page->mapping, -EIO);
+ int err = blk_status_to_errno(bio->bi_status);
+ folio_set_error(folio);
+ mapping_set_error(folio->mapping, err);
}
- bh = head = page_buffers(page);
+ bh = head = folio_buffers(folio);
/*
- * We check all buffers in the page under b_uptodate_lock
+ * We check all buffers in the folio under b_uptodate_lock
* to avoid races with other end io clearing async_write flags
*/
spin_lock_irqsave(&head->b_uptodate_lock, flags);
@@ -141,8 +141,8 @@ static void ext4_finish_bio(struct bio *bio)
} while ((bh = bh->b_this_page) != head);
spin_unlock_irqrestore(&head->b_uptodate_lock, flags);
if (!under_io) {
- fscrypt_free_bounce_page(bounce_page);
- end_page_writeback(page);
+ fscrypt_free_bounce_page(&io_folio->page);
+ folio_end_writeback(folio);
}
}
}
--
2.35.1


2023-01-26 20:25:31

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio()

All callers now have a folio so we can pass one in and use the folio
APIs to support large folios as well as save instructions by eliminating
calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/inode.c | 29 ++++++++++++++---------------
1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index e8f2918fd854..8b91e325492f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2097,34 +2097,33 @@ static int ext4_writepage(struct page *page,
return ret;
}

-static int mpage_submit_page(struct mpage_da_data *mpd, struct page *page)
+static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
{
- int len;
+ size_t len;
loff_t size;
int err;

- BUG_ON(page->index != mpd->first_page);
- clear_page_dirty_for_io(page);
+ BUG_ON(folio->index != mpd->first_page);
+ folio_clear_dirty_for_io(folio);
/*
* We have to be very careful here! Nothing protects writeback path
* against i_size changes and the page can be writeably mapped into
* page tables. So an application can be growing i_size and writing
- * data through mmap while writeback runs. clear_page_dirty_for_io()
+ * data through mmap while writeback runs. folio_clear_dirty_for_io()
* write-protects our page in page tables and the page cannot get
- * written to again until we release page lock. So only after
- * clear_page_dirty_for_io() we are safe to sample i_size for
+ * written to again until we release folio lock. So only after
+ * folio_clear_dirty_for_io() we are safe to sample i_size for
* ext4_bio_write_page() to zero-out tail of the written page. We rely
* on the barrier provided by TestClearPageDirty in
- * clear_page_dirty_for_io() to make sure i_size is really sampled only
+ * folio_clear_dirty_for_io() to make sure i_size is really sampled only
* after page tables are updated.
*/
size = i_size_read(mpd->inode);
- if (page->index == size >> PAGE_SHIFT &&
+ len = folio_size(folio);
+ if (folio_pos(folio) + len > size &&
!ext4_verity_in_progress(mpd->inode))
len = size & ~PAGE_MASK;
- else
- len = PAGE_SIZE;
- err = ext4_bio_write_page(&mpd->io_submit, page, len);
+ err = ext4_bio_write_page(&mpd->io_submit, &folio->page, len);
if (!err)
mpd->wbc->nr_to_write--;
mpd->first_page++;
@@ -2238,7 +2237,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
} while (lblk++, (bh = bh->b_this_page) != head);
/* So far everything mapped? Submit the page for IO. */
if (mpd->map.m_len == 0) {
- err = mpage_submit_page(mpd, head->b_page);
+ err = mpage_submit_folio(mpd, head->b_folio);
if (err < 0)
return err;
}
@@ -2370,7 +2369,7 @@ static int mpage_map_and_submit_buffers(struct mpage_da_data *mpd)
if (err < 0 || map_bh)
goto out;
/* Page fully mapped - let IO run! */
- err = mpage_submit_page(mpd, &folio->page);
+ err = mpage_submit_folio(mpd, folio);
if (err < 0)
goto out;
}
@@ -2680,7 +2679,7 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
*/
if (!mpd->can_map) {
if (ext4_page_nomap_can_writeout(&folio->page)) {
- err = mpage_submit_page(mpd, &folio->page);
+ err = mpage_submit_folio(mpd, folio);
if (err < 0)
goto out;
} else {
--
2.35.1


2023-01-26 20:25:50

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio

Use the folio API in this function, saves a few calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/ext4.h | 2 +-
fs/ext4/inline.c | 14 +++++++-------
fs/ext4/inode.c | 2 +-
3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7a132e8648f4..d2998800855c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3549,7 +3549,7 @@ extern int ext4_init_inline_data(handle_t *handle, struct inode *inode,
unsigned int len);
extern int ext4_destroy_inline_data(handle_t *handle, struct inode *inode);

-extern int ext4_readpage_inline(struct inode *inode, struct page *page);
+int ext4_readpage_inline(struct inode *inode, struct folio *folio);
extern int ext4_try_to_write_inline_data(struct address_space *mapping,
struct inode *inode,
loff_t pos, unsigned len,
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 2b42ececa46d..38f6282cc012 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -502,7 +502,7 @@ static int ext4_read_inline_page(struct inode *inode, struct page *page)
return ret;
}

-int ext4_readpage_inline(struct inode *inode, struct page *page)
+int ext4_readpage_inline(struct inode *inode, struct folio *folio)
{
int ret = 0;

@@ -516,16 +516,16 @@ int ext4_readpage_inline(struct inode *inode, struct page *page)
* Current inline data can only exist in the 1st page,
* So for all the other pages, just set them uptodate.
*/
- if (!page->index)
- ret = ext4_read_inline_page(inode, page);
- else if (!PageUptodate(page)) {
- zero_user_segment(page, 0, PAGE_SIZE);
- SetPageUptodate(page);
+ if (!folio->index)
+ ret = ext4_read_inline_page(inode, &folio->page);
+ else if (!folio_test_uptodate(folio)) {
+ folio_zero_segment(folio, 0, PAGE_SIZE);
+ folio_mark_uptodate(folio);
}

up_read(&EXT4_I(inode)->xattr_sem);

- unlock_page(page);
+ folio_unlock(folio);
return ret >= 0 ? 0 : ret;
}

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index fcd904123384..c627686295e0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3300,7 +3300,7 @@ static int ext4_read_folio(struct file *file, struct folio *folio)
trace_ext4_readpage(page);

if (ext4_has_inline_data(inode))
- ret = ext4_readpage_inline(inode, page);
+ ret = ext4_readpage_inline(inode, folio);

if (ret == -EAGAIN)
return ext4_mpage_readpages(inode, NULL, page);
--
2.35.1


2023-01-26 20:25:51

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()

Both callers now have a folio so pass it in directly and avoid the call
to page_folio() at the beginning.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/ext4/ext4.h | 5 ++---
fs/ext4/inode.c | 18 +++++++++---------
fs/ext4/page-io.c | 10 ++++------
3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 43e26e6f6e42..7a132e8648f4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3756,9 +3756,8 @@ extern void ext4_io_submit_init(struct ext4_io_submit *io,
struct writeback_control *wbc);
extern void ext4_end_io_rsv_work(struct work_struct *work);
extern void ext4_io_submit(struct ext4_io_submit *io);
-extern int ext4_bio_write_page(struct ext4_io_submit *io,
- struct page *page,
- int len);
+int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *page,
+ size_t len);
extern struct ext4_io_end_vec *ext4_alloc_io_end_vec(ext4_io_end_t *io_end);
extern struct ext4_io_end_vec *ext4_last_io_end_vec(ext4_io_end_t *io_end);

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8b91e325492f..fcd904123384 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2014,9 +2014,9 @@ static int ext4_writepage(struct page *page,
struct folio *folio = page_folio(page);
int ret = 0;
loff_t size;
- unsigned int len;
+ size_t len;
struct buffer_head *page_bufs = NULL;
- struct inode *inode = page->mapping->host;
+ struct inode *inode = folio->mapping->host;
struct ext4_io_submit io_submit;

if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) {
@@ -2052,12 +2052,12 @@ static int ext4_writepage(struct page *page,
* Also, if there is only one buffer per page (the fs block
* size == the page size), if one buffer needs block
* allocation or needs to modify the extent tree to clear the
- * unwritten flag, we know that the page can't be written at
+ * unwritten flag, we know that the folio can't be written at
* all, so we might as well refuse the write immediately.
* Unfortunately if the block size != page size, we can't as
* easily detect this case using ext4_walk_page_buffers(), but
* for the extremely common case, this is an optimization that
- * skips a useless round trip through ext4_bio_write_page().
+ * skips a useless round trip through ext4_bio_write_folio().
*/
if (ext4_walk_page_buffers(NULL, inode, page_bufs, 0, len, NULL,
ext4_bh_delay_or_unwritten)) {
@@ -2079,7 +2079,7 @@ static int ext4_writepage(struct page *page,
if (folio_test_checked(folio) && ext4_should_journal_data(inode))
/*
* It's mmapped pagecache. Add buffers and journal it. There
- * doesn't seem much point in redirtying the page here.
+ * doesn't seem much point in redirtying the folio here.
*/
return __ext4_journalled_writepage(page, len);

@@ -2090,7 +2090,7 @@ static int ext4_writepage(struct page *page,
folio_unlock(folio);
return -ENOMEM;
}
- ret = ext4_bio_write_page(&io_submit, page, len);
+ ret = ext4_bio_write_folio(&io_submit, folio, len);
ext4_io_submit(&io_submit);
/* Drop io_end reference we got from init */
ext4_put_io_end_defer(io_submit.io_end);
@@ -2113,8 +2113,8 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
* write-protects our page in page tables and the page cannot get
* written to again until we release folio lock. So only after
* folio_clear_dirty_for_io() we are safe to sample i_size for
- * ext4_bio_write_page() to zero-out tail of the written page. We rely
- * on the barrier provided by TestClearPageDirty in
+ * ext4_bio_write_folio() to zero-out tail of the written page. We rely
+ * on the barrier provided by folio_test_clear_dirty() in
* folio_clear_dirty_for_io() to make sure i_size is really sampled only
* after page tables are updated.
*/
@@ -2123,7 +2123,7 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
if (folio_pos(folio) + len > size &&
!ext4_verity_in_progress(mpd->inode))
len = size & ~PAGE_MASK;
- err = ext4_bio_write_page(&mpd->io_submit, &folio->page, len);
+ err = ext4_bio_write_folio(&mpd->io_submit, folio, len);
if (!err)
mpd->wbc->nr_to_write--;
mpd->first_page++;
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index fd6c0dca24b9..c6da8800a49f 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -425,11 +425,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
io->io_next_block++;
}

-int ext4_bio_write_page(struct ext4_io_submit *io,
- struct page *page,
- int len)
+int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *folio,
+ size_t len)
{
- struct folio *folio = page_folio(page);
struct folio *io_folio = folio;
struct inode *inode = folio->mapping->host;
unsigned block_start;
@@ -522,8 +520,8 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
if (io->io_bio)
gfp_flags = GFP_NOWAIT | __GFP_NOWARN;
retry_encrypt:
- bounce_page = fscrypt_encrypt_pagecache_blocks(page, enc_bytes,
- 0, gfp_flags);
+ bounce_page = fscrypt_encrypt_pagecache_blocks(&folio->page,
+ enc_bytes, 0, gfp_flags);
if (IS_ERR(bounce_page)) {
ret = PTR_ERR(bounce_page);
if (ret == -ENOMEM &&
--
2.35.1


2023-01-28 16:53:51

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio

Hi Matthew,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20230127]
[cannot apply to tytso-ext4/dev xfs-linux/for-next linus/master v6.2-rc5 v6.2-rc4 v6.2-rc3 v6.2-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
patch link: https://lore.kernel.org/r/20230126202415.1682629-4-willy%40infradead.org
patch subject: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
config: loongarch-randconfig-r001-20230123 (https://download.01.org/0day-ci/archive/20230129/[email protected]/config)
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
git checkout f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "bio_add_folio" [fs/ext4/ext4.ko] undefined!

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-01-28 19:08:54

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio

Hi Matthew,

I love your patch! Yet something to improve:

[auto build test ERROR on next-20230127]
[cannot apply to tytso-ext4/dev xfs-linux/for-next linus/master v6.2-rc5 v6.2-rc4 v6.2-rc3 v6.2-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
patch link: https://lore.kernel.org/r/20230126202415.1682629-4-willy%40infradead.org
patch subject: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio
config: x86_64-randconfig-a013-20230123 (https://download.01.org/0day-ci/archive/20230129/[email protected]/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Matthew-Wilcox-Oracle/fs-Add-FGP_WRITEBEGIN/20230128-150212
git checkout f6e4c5cfaf2ef7b8ee6c5354bbbd5f1ee758746f
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "bio_add_folio" [fs/ext4/ext4.ko] undefined!

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-03-05 11:18:51

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio

"Matthew Wilcox (Oracle)" <[email protected]> writes:

> Remove several calls to compound_head() and the last caller of
> set_page_writeback_keepwrite(), so remove the wrapper too.

Straight forward conversion.
Looks good to me. Please feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <[email protected]>

> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> ---
> fs/ext4/page-io.c | 58 ++++++++++++++++++--------------------
> include/linux/page-flags.h | 5 ----
> 2 files changed, 27 insertions(+), 36 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index beaec6d81074..982791050892 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -409,11 +409,9 @@ static void io_submit_init_bio(struct ext4_io_submit *io,
>
> static void io_submit_add_bh(struct ext4_io_submit *io,
> struct inode *inode,
> - struct page *page,
> + struct folio *folio,
> struct buffer_head *bh)
> {
> - int ret;
> -
> if (io->io_bio && (bh->b_blocknr != io->io_next_block ||
> !fscrypt_mergeable_bio_bh(io->io_bio, bh))) {
> submit_and_retry:
> @@ -421,10 +419,9 @@ static void io_submit_add_bh(struct ext4_io_submit *io,
> }
> if (io->io_bio == NULL)
> io_submit_init_bio(io, bh);
> - ret = bio_add_page(io->io_bio, page, bh->b_size, bh_offset(bh));
> - if (ret != bh->b_size)
> + if (!bio_add_folio(io->io_bio, folio, bh->b_size, bh_offset(bh)))
> goto submit_and_retry;
> - wbc_account_cgroup_owner(io->io_wbc, page, bh->b_size);
> + wbc_account_cgroup_owner(io->io_wbc, &folio->page, bh->b_size);
> io->io_next_block++;
> }
>
> @@ -432,8 +429,9 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> struct page *page,
> int len)
> {
> - struct page *bounce_page = NULL;
> - struct inode *inode = page->mapping->host;
> + struct folio *folio = page_folio(page);
> + struct folio *io_folio = folio;
> + struct inode *inode = folio->mapping->host;
> unsigned block_start;
> struct buffer_head *bh, *head;
> int ret = 0;
> @@ -441,30 +439,30 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> struct writeback_control *wbc = io->io_wbc;
> bool keep_towrite = false;
>
> - BUG_ON(!PageLocked(page));
> - BUG_ON(PageWriteback(page));
> + BUG_ON(!folio_test_locked(folio));
> + BUG_ON(folio_test_writeback(folio));
>
> - ClearPageError(page);
> + folio_clear_error(folio);
>
> /*
> * Comments copied from block_write_full_page:
> *
> - * The page straddles i_size. It must be zeroed out on each and every
> + * The folio straddles i_size. It must be zeroed out on each and every
> * writepage invocation because it may be mmapped. "A file is mapped
> * in multiples of the page size. For a file that is not a multiple of
> * the page size, the remaining memory is zeroed when mapped, and
> * writes to that region are not written out to the file."
> */
> - if (len < PAGE_SIZE)
> - zero_user_segment(page, len, PAGE_SIZE);
> + if (len < folio_size(folio))
> + folio_zero_segment(folio, len, folio_size(folio));
> /*
> * In the first loop we prepare and mark buffers to submit. We have to
> - * mark all buffers in the page before submitting so that
> - * end_page_writeback() cannot be called from ext4_end_bio() when IO
> + * mark all buffers in the folio before submitting so that
> + * folio_end_writeback() cannot be called from ext4_end_bio() when IO
> * on the first buffer finishes and we are still working on submitting
> * the second buffer.
> */
> - bh = head = page_buffers(page);
> + bh = head = folio_buffers(folio);
> do {
> block_start = bh_offset(bh);
> if (block_start >= len) {
> @@ -479,14 +477,14 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> clear_buffer_dirty(bh);
> /*
> * Keeping dirty some buffer we cannot write? Make sure
> - * to redirty the page and keep TOWRITE tag so that
> - * racing WB_SYNC_ALL writeback does not skip the page.
> + * to redirty the folio and keep TOWRITE tag so that
> + * racing WB_SYNC_ALL writeback does not skip the folio.
> * This happens e.g. when doing writeout for
> * transaction commit.
> */
> if (buffer_dirty(bh)) {
> - if (!PageDirty(page))
> - redirty_page_for_writepage(wbc, page);
> + if (!folio_test_dirty(folio))
> + folio_redirty_for_writepage(wbc, folio);
> keep_towrite = true;
> }
> continue;
> @@ -498,11 +496,11 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> nr_to_submit++;
> } while ((bh = bh->b_this_page) != head);
>
> - /* Nothing to submit? Just unlock the page... */
> + /* Nothing to submit? Just unlock the folio... */
> if (!nr_to_submit)
> goto unlock;
>
> - bh = head = page_buffers(page);
> + bh = head = folio_buffers(folio);
>
> /*
> * If any blocks are being written to an encrypted file, encrypt them
> @@ -514,6 +512,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> if (fscrypt_inode_uses_fs_layer_crypto(inode) && nr_to_submit) {
> gfp_t gfp_flags = GFP_NOFS;
> unsigned int enc_bytes = round_up(len, i_blocksize(inode));
> + struct page *bounce_page;
>
> /*
> * Since bounce page allocation uses a mempool, we can only use
> @@ -540,7 +539,7 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> }
>
> printk_ratelimited(KERN_ERR "%s: ret = %d\n", __func__, ret);
> - redirty_page_for_writepage(wbc, page);
> + folio_redirty_for_writepage(wbc, folio);
> do {
> if (buffer_async_write(bh)) {
> clear_buffer_async_write(bh);
> @@ -550,21 +549,18 @@ int ext4_bio_write_page(struct ext4_io_submit *io,
> } while (bh != head);
> goto unlock;
> }
> + io_folio = page_folio(bounce_page);
> }
>
> - if (keep_towrite)
> - set_page_writeback_keepwrite(page);
> - else
> - set_page_writeback(page);
> + __folio_start_writeback(folio, keep_towrite);
>
> /* Now submit buffers to write */
> do {
> if (!buffer_async_write(bh))
> continue;
> - io_submit_add_bh(io, inode,
> - bounce_page ? bounce_page : page, bh);
> + io_submit_add_bh(io, inode, io_folio, bh);
> } while ((bh = bh->b_this_page) != head);
> unlock:
> - unlock_page(page);
> + folio_unlock(folio);
> return ret;
> }
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 0425f22a9c82..bba2a32031a2 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -766,11 +766,6 @@ bool set_page_writeback(struct page *page);
> #define folio_start_writeback_keepwrite(folio) \
> __folio_start_writeback(folio, true)
>
> -static inline void set_page_writeback_keepwrite(struct page *page)
> -{
> - folio_start_writeback_keepwrite(page_folio(page));
> -}
> -
> static inline bool test_set_page_writeback(struct page *page)
> {
> return set_page_writeback(page);
> --
> 2.35.1

2023-03-06 09:11:14

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

"Matthew Wilcox (Oracle)" <[email protected]> writes:

> Prepare ext4 to support large folios in the page writeback path.

Sure. I am guessing for ext4 to completely support large folio
requires more work like fscrypt bounce page handling doesn't
yet support folios right?

Could you please give a little background on what all would be required
to add large folio support in ext4 buffered I/O path?
(I mean ofcourse other than saying move ext4 to iomap ;))

What I was interested in was, what other components in particular for
e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?

And how should one go about in adding this support? So can we move
ext4's read path to have large folio support to get started?
Have you already identified what all is missing from this path to
convert it?

> Also set the actual error in the mapping, not just -EIO.

Right. I looked at the history and I think it always just had EIO.
I think setting the actual err in mapping_set_error() is the right thing
to do here.

>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

w.r.t this patch series. I reviewed the mechanical changes & error paths
which converts ext4 ext4_finish_bio() to use folio.

The changes looks good to me from that perspective. Feel free to add -
Reviewed-by: Ritesh Harjani (IBM) <[email protected]>


> ---
> fs/ext4/page-io.c | 32 ++++++++++++++++----------------
> 1 file changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 982791050892..fd6c0dca24b9 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -99,30 +99,30 @@ static void buffer_io_error(struct buffer_head *bh)
>
> static void ext4_finish_bio(struct bio *bio)
> {
> - struct bio_vec *bvec;
> - struct bvec_iter_all iter_all;
> + struct folio_iter fi;
>
> - bio_for_each_segment_all(bvec, bio, iter_all) {
> - struct page *page = bvec->bv_page;
> - struct page *bounce_page = NULL;
> + bio_for_each_folio_all(fi, bio) {
> + struct folio *folio = fi.folio;
> + struct folio *io_folio = NULL;
> struct buffer_head *bh, *head;
> - unsigned bio_start = bvec->bv_offset;
> - unsigned bio_end = bio_start + bvec->bv_len;
> + size_t bio_start = fi.offset;
> + size_t bio_end = bio_start + fi.length;
> unsigned under_io = 0;
> unsigned long flags;
>
> - if (fscrypt_is_bounce_page(page)) {
> - bounce_page = page;
> - page = fscrypt_pagecache_page(bounce_page);
> + if (fscrypt_is_bounce_folio(folio)) {
> + io_folio = folio;
> + folio = fscrypt_pagecache_folio(folio);
> }
>
> if (bio->bi_status) {
> - SetPageError(page);
> - mapping_set_error(page->mapping, -EIO);
> + int err = blk_status_to_errno(bio->bi_status);
> + folio_set_error(folio);
> + mapping_set_error(folio->mapping, err);
> }
> - bh = head = page_buffers(page);
> + bh = head = folio_buffers(folio);
> /*
> - * We check all buffers in the page under b_uptodate_lock
> + * We check all buffers in the folio under b_uptodate_lock
> * to avoid races with other end io clearing async_write flags
> */
> spin_lock_irqsave(&head->b_uptodate_lock, flags);
> @@ -141,8 +141,8 @@ static void ext4_finish_bio(struct bio *bio)
> } while ((bh = bh->b_this_page) != head);
> spin_unlock_irqrestore(&head->b_uptodate_lock, flags);
> if (!under_io) {
> - fscrypt_free_bounce_page(bounce_page);
> - end_page_writeback(page);
> + fscrypt_free_bounce_page(&io_folio->page);

Could you please help understand what would it take to convert bounce
page in fscrypt to folio?

Today, we allocate 32 bounce pages of order 0 via mempool in
fscrypt_initialize()
<...>
fscrypt_bounce_page_pool =
mempool_create_page_pool(num_prealloc_crypto_pages, 0);
<...>

And IIUC, we might need to add some support for having higher order
pages in the pool so that one can allocate a folio->_folio_order
folio from this pool for bounce page to support large folio.
Is that understanding correct? Your thoughts on this please?

-ritesh

2023-03-06 18:45:28

by Ritesh Harjani

[permalink] [raw]
Subject: Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio

"Matthew Wilcox (Oracle)" <[email protected]> writes:

> Prepare for multi-page folios and save some instructions by converting
> to the folio API.

Mostly a straight forward change. The changes looks good to me.
Please feel free to add -

Reviewed-by: Ritesh Harjani (IBM) <[email protected]>

In later few patches I see ext4_readpage converted to ext4_read_folio().
I think the reason why we have not changed ext4_writepage() to
ext4_write_folio() is because we anyway would like to get rid of
->writepage ops eventually in future, so no point.
I think there is even patch series from Jan which tries to kill
ext4_writepage() completely.

-ritesh

2023-03-14 22:07:33

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 03/31] ext4: Convert ext4_bio_write_page() to use a folio

On Thu, Jan 26, 2023 at 08:23:47PM +0000, Matthew Wilcox (Oracle) wrote:
> Remove several calls to compound_head() and the last caller of
> set_page_writeback_keepwrite(), so remove the wrapper too.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

Reviewed-by: Theodore Ts'o <[email protected]>

2023-03-14 22:08:31

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Thu, Jan 26, 2023 at 08:23:48PM +0000, Matthew Wilcox (Oracle) wrote:
> Prepare ext4 to support large folios in the page writeback path.
> Also set the actual error in the mapping, not just -EIO.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

Reviewed-by: Theodore Ts'o <[email protected]>

2023-03-14 22:27:06

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio

On Tue, Mar 07, 2023 at 12:15:13AM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <[email protected]> writes:
>
> > Prepare for multi-page folios and save some instructions by converting
> > to the folio API.
>
> Mostly a straight forward change. The changes looks good to me.
> Please feel free to add -
>
> Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
>
> In later few patches I see ext4_readpage converted to ext4_read_folio().
> I think the reason why we have not changed ext4_writepage() to
> ext4_write_folio() is because we anyway would like to get rid of
> ->writepage ops eventually in future, so no point.
> I think there is even patch series from Jan which tries to kill
> ext4_writepage() completely.

Indeed, Jan's patch series[1] is about to land in the ext4 tree, and
that's going to remove ext4_writepages. The main reason why this
hadn't landed yet was due to some conflicts with some other folio
changes, so you should be able to drop this patch when you rebase this
patch series.

- Ted

[1] https://lore.kernel.org/all/[email protected]/

2023-03-14 22:28:34

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 07/31] ext4: Convert mpage_submit_page() to mpage_submit_folio()

On Thu, Jan 26, 2023 at 08:23:51PM +0000, Matthew Wilcox (Oracle) wrote:
> All callers now have a folio so we can pass one in and use the folio
> APIs to support large folios as well as save instructions by eliminating
> calls to compound_head().
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

Reviewed-by: Theodore Ts'o <[email protected]>

2023-03-14 22:31:31

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 08/31] ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio()

On Thu, Jan 26, 2023 at 08:23:52PM +0000, Matthew Wilcox (Oracle) wrote:
> Both callers now have a folio so pass it in directly and avoid the call
> to page_folio() at the beginning.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

The ext4_writepage() changes will need to be dropped when you rebase,
but other than that....

Reviewed-by: Theodore Ts'o <[email protected]>

2023-03-14 22:32:03

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 09/31] ext4: Convert ext4_readpage_inline() to take a folio

On Thu, Jan 26, 2023 at 08:23:53PM +0000, Matthew Wilcox (Oracle) wrote:
> Use the folio API in this function, saves a few calls to compound_head().
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

Reviewed-by: Theodore Ts'o <[email protected]>

2023-03-15 17:58:19

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 00/31] Convert most of ext4 to folios

I've pushed Jan's data=writeback cleanup patches, which, among other
things completely eliminates ext4_writepage) to the ext4 tree's dev
branch. So when you rebase these patches for the next version of this
series, please base them on

https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev

Thanks!!

- Ted

2023-03-23 03:30:46

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> "Matthew Wilcox (Oracle)" <[email protected]> writes:
>
> > Prepare ext4 to support large folios in the page writeback path.
>
> Sure. I am guessing for ext4 to completely support large folio
> requires more work like fscrypt bounce page handling doesn't
> yet support folios right?
>
> Could you please give a little background on what all would be required
> to add large folio support in ext4 buffered I/O path?
> (I mean ofcourse other than saying move ext4 to iomap ;))
>
> What I was interested in was, what other components in particular for
> e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
>
> And how should one go about in adding this support? So can we move
> ext4's read path to have large folio support to get started?
> Have you already identified what all is missing from this path to
> convert it?

Honestly, I don't know what else needs to be done beyond this patch
series. I can point at some stuff and say "This doesn't work", but in
general, you have to just enable it and see what breaks. A lot of the
buffer_head code is not large-folio safe right now, so that's somewhere
to go and look. Or maybe we "just" convert to iomap, and never bother
fixing the bufferhead code for large folios.

> > Also set the actual error in the mapping, not just -EIO.
>
> Right. I looked at the history and I think it always just had EIO.
> I think setting the actual err in mapping_set_error() is the right thing
> to do here.
>
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
>
> w.r.t this patch series. I reviewed the mechanical changes & error paths
> which converts ext4 ext4_finish_bio() to use folio.
>
> The changes looks good to me from that perspective. Feel free to add -
> Reviewed-by: Ritesh Harjani (IBM) <[email protected]>

Thanks!

2023-03-23 03:32:06

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 05/31] ext4: Convert ext4_writepage() to use a folio

On Tue, Mar 14, 2023 at 06:26:46PM -0400, Theodore Ts'o wrote:
> On Tue, Mar 07, 2023 at 12:15:13AM +0530, Ritesh Harjani wrote:
> > "Matthew Wilcox (Oracle)" <[email protected]> writes:
> >
> > > Prepare for multi-page folios and save some instructions by converting
> > > to the folio API.
> >
> > Mostly a straight forward change. The changes looks good to me.
> > Please feel free to add -
> >
> > Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
> >
> > In later few patches I see ext4_readpage converted to ext4_read_folio().
> > I think the reason why we have not changed ext4_writepage() to
> > ext4_write_folio() is because we anyway would like to get rid of
> > ->writepage ops eventually in future, so no point.
> > I think there is even patch series from Jan which tries to kill
> > ext4_writepage() completely.
>
> Indeed, Jan's patch series[1] is about to land in the ext4 tree, and
> that's going to remove ext4_writepages. The main reason why this
> hadn't landed yet was due to some conflicts with some other folio
> changes, so you should be able to drop this patch when you rebase this
> patch series.

Correct; in the rebase, I ended up just dropping this patch.

2023-03-23 15:01:16

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > "Matthew Wilcox (Oracle)" <[email protected]> writes:
> >
> > > Prepare ext4 to support large folios in the page writeback path.
> >
> > Sure. I am guessing for ext4 to completely support large folio
> > requires more work like fscrypt bounce page handling doesn't
> > yet support folios right?
> >
> > Could you please give a little background on what all would be required
> > to add large folio support in ext4 buffered I/O path?
> > (I mean ofcourse other than saying move ext4 to iomap ;))
> >
> > What I was interested in was, what other components in particular for
> > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> >
> > And how should one go about in adding this support? So can we move
> > ext4's read path to have large folio support to get started?
> > Have you already identified what all is missing from this path to
> > convert it?
>
> Honestly, I don't know what else needs to be done beyond this patch
> series. I can point at some stuff and say "This doesn't work", but in
> general, you have to just enable it and see what breaks. A lot of the
> buffer_head code is not large-folio safe right now, so that's somewhere
> to go and look. Or maybe we "just" convert to iomap, and never bother
> fixing the bufferhead code for large folios.

Yes. Let's leave bufferheads in the legacy doo-doo-dooooo basement
instead of wasting more time on them. Ideally we'd someday run all the
filesystems through:

bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
large folios -> retire to somewhere cheaper than Hawaii

--D

> > > Also set the actual error in the mapping, not just -EIO.
> >
> > Right. I looked at the history and I think it always just had EIO.
> > I think setting the actual err in mapping_set_error() is the right thing
> > to do here.
> >
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> >
> > w.r.t this patch series. I reviewed the mechanical changes & error paths
> > which converts ext4 ext4_finish_bio() to use folio.
> >
> > The changes looks good to me from that perspective. Feel free to add -
> > Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
>
> Thanks!

2023-03-23 15:49:49

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 23, 2023 at 03:26:43AM +0000, Matthew Wilcox wrote:
> > On Mon, Mar 06, 2023 at 02:40:55PM +0530, Ritesh Harjani wrote:
> > > "Matthew Wilcox (Oracle)" <[email protected]> writes:
> > >
> > > > Prepare ext4 to support large folios in the page writeback path.
> > >
> > > Sure. I am guessing for ext4 to completely support large folio
> > > requires more work like fscrypt bounce page handling doesn't
> > > yet support folios right?
> > >
> > > Could you please give a little background on what all would be required
> > > to add large folio support in ext4 buffered I/O path?
> > > (I mean ofcourse other than saying move ext4 to iomap ;))
> > >
> > > What I was interested in was, what other components in particular for
> > > e.g. fscrypt, fsverity, ext4's xyz component needs large folio support?
> > >
> > > And how should one go about in adding this support? So can we move
> > > ext4's read path to have large folio support to get started?
> > > Have you already identified what all is missing from this path to
> > > convert it?
> >
> > Honestly, I don't know what else needs to be done beyond this patch
> > series. I can point at some stuff and say "This doesn't work", but in
> > general, you have to just enable it and see what breaks. A lot of the
> > buffer_head code is not large-folio safe right now, so that's somewhere
> > to go and look. Or maybe we "just" convert to iomap, and never bother
> > fixing the bufferhead code for large folios.
>
> Yes. Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them. Ideally we'd someday run all the
> filesystems through:
>
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

Places cheaper than Hawaii probably aren't as pretty as Hawaii though :-(

XFS is fine because it uses xfs_buf, but if we don't add support for
large folios to bufferheads, we can't support LBA size > PAGE_SIZE even
to read the superblock. Maybe that's fine ... only filesystems which
don't use sb_bread() get to support LBA size > PAGE_SIZE.

I really want to see a cheaper abstraction for accessing the block device
than BHs. Or xfs_buf for that matter.

2023-03-27 01:04:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Thu, Mar 23, 2023 at 03:30:38PM +0000, Matthew Wilcox wrote:
> I really want to see a cheaper abstraction for accessing the block device
> than BHs. Or xfs_buf for that matter.

You literally can just use the bdev page cache using the normal page
cache helpers. It's not quite what most of these file systems expect,
though, especially for the block size < PAGE_SIZE case.

2023-03-27 01:04:17

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 04/31] ext4: Convert ext4_finish_bio() to use folios

On Thu, Mar 23, 2023 at 07:51:09AM -0700, Darrick J. Wong wrote:
> Yes. Let's leave bufferheads in the legacy doo-doo-dooooo basement
> instead of wasting more time on them. Ideally we'd someday run all the
> filesystems through:
>
> bufferheads -> iomap with bufferheads -> iomap with folios -> iomap with
> large folios -> retire to somewhere cheaper than Hawaii

For a lot of the legacy stuff (and with that I don't mean ext4) we'd
really need volunteers to do any work other than typo fixing and
cosmetic cleanups. I suspect just dropping many of them is the only
thing we can do long term.

But even if we do the above for the data path for all file systems
remaining in tree, we still have buffer_heads for metadata. And
I think buffered_heads really are more or less the right abstraction
there anyway. And for these existing file systems we also do not
care about using large folios for metadata caching anyway. The
only nasty part is that these buffer_heads use the same mapping
as all block device access, so we'll need to find a way to use
large folios for the block device mapping for the LBA size > page size
case, while supporting buffer heads for those file systems. Nothing
unsolvable, but a bit tricky.