This patchset converts XFS & iomap to use folios, and gets them to
a state where they can handle multi-page folios. I don't anticipate
needing to touch XFS again until we're at the point where we want to
convert the aops to be type-safe. The patches apply to both current
Linus head and next-20211101. It completes an xfstests run with no
unexpected failures. Most of these patches have been posted before and
I've retained acks/reviews where I thought them reasonable. Some are new.
I'd really like a better name than 'mapping_set_large_folios()'.
mapping_set_multi_page_folios() seems a bit long. mapping_set_mpf()
is a bit obscure.
Jens, I'd really like your ack on patches 2 & 3; I know we discussed
them before.
Matthew Wilcox (Oracle) (21):
fs: Remove FS_THP_SUPPORT
block: Add bio_add_folio()
block: Add bio_for_each_folio_all()
iomap: Convert to_iomap_page to take a folio
iomap: Convert iomap_page_create to take a folio
iomap: Convert iomap_page_release to take a folio
iomap: Convert iomap_releasepage to use a folio
iomap: Add iomap_invalidate_folio
iomap: Pass the iomap_page into iomap_set_range_uptodate
iomap: Convert bio completions to use folios
iomap: Use folio offsets instead of page offsets
iomap: Convert iomap_read_inline_data to take a folio
iomap: Convert readahead and readpage to use a folio
iomap: Convert iomap_page_mkwrite to use a folio
iomap: Convert iomap_write_begin and iomap_write_end to folios
iomap: Convert iomap_write_end_inline to take a folio
iomap,xfs: Convert ->discard_page to ->discard_folio
iomap: Convert iomap_add_to_ioend to take a folio
iomap: Convert iomap_migrate_page to use folios
iomap: Support multi-page folios in invalidatepage
xfs: Support multi-page folios
Documentation/core-api/kernel-api.rst | 1 +
block/bio.c | 22 ++
fs/inode.c | 2 -
fs/iomap/buffered-io.c | 499 +++++++++++++-------------
fs/xfs/xfs_aops.c | 24 +-
fs/xfs/xfs_icache.c | 2 +
include/linux/bio.h | 56 ++-
include/linux/fs.h | 1 -
include/linux/iomap.h | 3 +-
include/linux/pagemap.h | 16 +
mm/shmem.c | 3 +-
11 files changed, 366 insertions(+), 263 deletions(-)
--
2.33.0
Instead of setting a bit in the fs_flags to set a bit in the
address_space, set the bit in the address_space directly.
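For illustration, a filesystem that previously set FS_THP_SUPPORT in its
->fs_flags now opts in from its inode constructor instead, along these
lines ('examplefs' is hypothetical; the shmem hunk below does exactly this):

	/* sketch only, not part of this patch */
	static struct inode *examplefs_new_inode(struct super_block *sb)
	{
		struct inode *inode = new_inode(sb);

		if (inode) {
			inode->i_ino = get_next_ino();
			/* replaces the old FS_THP_SUPPORT bit in ->fs_flags */
			mapping_set_large_folios(inode->i_mapping);
		}
		return inode;
	}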
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/inode.c | 2 --
include/linux/fs.h | 1 -
include/linux/pagemap.h | 16 ++++++++++++++++
mm/shmem.c | 3 ++-
4 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index ed0cab8a32db..bdfbd5962f2b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -180,8 +180,6 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
mapping->a_ops = &empty_aops;
mapping->host = inode;
mapping->flags = 0;
- if (sb->s_type->fs_flags & FS_THP_SUPPORT)
- __set_bit(AS_THP_SUPPORT, &mapping->flags);
mapping->wb_err = 0;
atomic_set(&mapping->i_mmap_writable, 0);
#ifdef CONFIG_READ_ONLY_THP_FOR_FS
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0dcb9020a7b3..d6a4eb6cf825 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2515,7 +2515,6 @@ struct file_system_type {
#define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */
#define FS_DISALLOW_NOTIFY_PERM 16 /* Disable fanotify permission events */
#define FS_ALLOW_IDMAP 32 /* FS has been updated to handle vfs idmappings. */
-#define FS_THP_SUPPORT 8192 /* Remove once all fs converted */
#define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
int (*init_fs_context)(struct fs_context *);
const struct fs_parameter_spec *parameters;
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 013cdc90f5fd..c17058e57aa4 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -126,6 +126,22 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
m->gfp_mask = mask;
}
+/**
+ * mapping_set_large_folios() - Indicate the file supports multi-page folios.
+ * @mapping: The file.
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate that the VFS can use multi-page folios to cache the contents
+ * of the file.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_large_folios(struct address_space *mapping)
+{
+ __set_bit(AS_THP_SUPPORT, &mapping->flags);
+}
+
static inline bool mapping_thp_support(struct address_space *mapping)
{
return test_bit(AS_THP_SUPPORT, &mapping->flags);
diff --git a/mm/shmem.c b/mm/shmem.c
index 17e344e26e73..eb7a898f7b0a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2304,6 +2304,7 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
INIT_LIST_HEAD(&info->swaplist);
simple_xattrs_init(&info->xattrs);
cache_no_acl(inode);
+ mapping_set_large_folios(inode->i_mapping);
switch (mode & S_IFMT) {
default:
@@ -3894,7 +3895,7 @@ static struct file_system_type shmem_fs_type = {
.parameters = shmem_fs_parameters,
#endif
.kill_sb = kill_litter_super,
- .fs_flags = FS_USERNS_MOUNT | FS_THP_SUPPORT,
+ .fs_flags = FS_USERNS_MOUNT,
};
int __init shmem_init(void)
--
2.33.0
This is a thin wrapper around bio_add_page(). The main advantage here
is the documentation that stupidly large folios are not supported.
It's not currently possible to allocate stupidly large folios, but if
it ever becomes possible, this function will fail gracefully instead of
doing I/O to the wrong bytes.
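For example, a small synchronous read of part of a folio (the shape iomap
uses later in this series) might look like this sketch; the function name
is hypothetical and the caller is assumed to supply bdev, sector, folio,
offset and len:

	/* sketch only, not part of this patch */
	static int example_read_folio_sync(struct block_device *bdev,
			sector_t sector, struct folio *folio, size_t offset,
			size_t len)
	{
		struct bio_vec bvec;
		struct bio bio;

		bio_init(&bio, &bvec, 1);
		bio.bi_opf = REQ_OP_READ;
		bio.bi_iter.bi_sector = sector;
		bio_set_dev(&bio, bdev);
		if (!bio_add_folio(&bio, folio, len, offset))
			return -EINVAL;	/* too large for a single bio_vec */
		return submit_bio_wait(&bio);
	}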
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
block/bio.c | 22 ++++++++++++++++++++++
include/linux/bio.h | 3 ++-
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c
index 15ab0d6d1c06..0e911c4fb9f2 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1033,6 +1033,28 @@ int bio_add_page(struct bio *bio, struct page *page,
}
EXPORT_SYMBOL(bio_add_page);
+/**
+ * bio_add_folio - Attempt to add part of a folio to a bio.
+ * @bio: BIO to add to.
+ * @folio: Folio to add.
+ * @len: How many bytes from the folio to add.
+ * @off: First byte in this folio to add.
+ *
+ * Filesystems that use folios can call this function instead of calling
+ * bio_add_page() for each page in the folio. If @off is bigger than
+ * PAGE_SIZE, this function can create a bio_vec that starts in a page
+ * after the bv_page. BIOs do not support folios that are 4GiB or larger.
+ *
+ * Return: Whether the addition was successful.
+ */
+bool bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
+ size_t off)
+{
+ if (len > UINT_MAX || off > UINT_MAX)
+		return false;
+ return bio_add_page(bio, &folio->page, len, off) > 0;
+}
+
void __bio_release_pages(struct bio *bio, bool mark_dirty)
{
struct bvec_iter_all iter_all;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index fe6bdfbbef66..a783cac49978 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -409,7 +409,8 @@ extern void bio_uninit(struct bio *);
extern void bio_reset(struct bio *);
void bio_chain(struct bio *, struct bio *);
-extern int bio_add_page(struct bio *, struct page *, unsigned int,unsigned int);
+int bio_add_page(struct bio *, struct page *, unsigned len, unsigned off);
+bool bio_add_folio(struct bio *, struct folio *, size_t len, size_t off);
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
int bio_add_zone_append_page(struct bio *bio, struct page *page,
--
2.33.0
Allow callers to iterate over each folio instead of each page. The
bio need not have been constructed using folios originally.
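For instance, a read completion handler can walk the bio folio-by-folio
rather than page-by-page, roughly like this (hypothetical handler, error
handling omitted; iomap's read and write completions are converted to
this pattern later in the series):

	/* sketch only, not part of this patch */
	static void example_read_end_io(struct bio *bio)
	{
		struct folio_iter fi;

		bio_for_each_folio_all(fi, bio) {
			/* fi.folio, fi.offset, fi.length describe one chunk */
			folio_mark_uptodate(fi.folio);
			folio_unlock(fi.folio);
		}
		bio_put(bio);
	}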
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
Documentation/core-api/kernel-api.rst | 1 +
include/linux/bio.h | 53 ++++++++++++++++++++++++++-
2 files changed, 53 insertions(+), 1 deletion(-)
diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst
index 2e7186805148..7f0cb604b6ab 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -279,6 +279,7 @@ Accounting Framework
Block Devices
=============
+.. kernel-doc:: include/linux/bio.h
.. kernel-doc:: block/blk-core.c
:export:
diff --git a/include/linux/bio.h b/include/linux/bio.h
index a783cac49978..43b252a99334 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -166,7 +166,7 @@ static inline void bio_advance(struct bio *bio, unsigned int nbytes)
*/
#define bio_for_each_bvec_all(bvl, bio, i) \
for (i = 0, bvl = bio_first_bvec_all(bio); \
- i < (bio)->bi_vcnt; i++, bvl++) \
+ i < (bio)->bi_vcnt; i++, bvl++)
#define bio_iter_last(bvec, iter) ((iter).bi_size == (bvec).bv_len)
@@ -260,6 +260,57 @@ static inline struct bio_vec *bio_last_bvec_all(struct bio *bio)
return &bio->bi_io_vec[bio->bi_vcnt - 1];
}
+/**
+ * struct folio_iter - State for iterating all folios in a bio.
+ * @folio: The current folio we're iterating. NULL after the last folio.
+ * @offset: The byte offset within the current folio.
+ * @length: The number of bytes in this iteration (will not cross folio
+ * boundary).
+ */
+struct folio_iter {
+ struct folio *folio;
+ size_t offset;
+ size_t length;
+ /* private: for use by the iterator */
+ size_t _seg_count;
+ int _i;
+};
+
+static inline
+void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
+{
+ struct bio_vec *bvec = bio_first_bvec_all(bio) + i;
+
+ fi->folio = page_folio(bvec->bv_page);
+ fi->offset = bvec->bv_offset +
+ PAGE_SIZE * (bvec->bv_page - &fi->folio->page);
+ fi->_seg_count = bvec->bv_len;
+ fi->length = min(folio_size(fi->folio) - fi->offset, fi->_seg_count);
+ fi->_i = i;
+}
+
+static inline void bio_next_folio(struct folio_iter *fi, struct bio *bio)
+{
+ fi->_seg_count -= fi->length;
+ if (fi->_seg_count) {
+ fi->folio = folio_next(fi->folio);
+ fi->offset = 0;
+ fi->length = min(folio_size(fi->folio), fi->_seg_count);
+ } else if (fi->_i + 1 < bio->bi_vcnt) {
+ bio_first_folio(fi, bio, fi->_i + 1);
+ } else {
+ fi->folio = NULL;
+ }
+}
+
+/**
+ * bio_for_each_folio_all - Iterate over each folio in a bio.
+ * @fi: struct folio_iter which is updated for each folio.
+ * @bio: struct bio to iterate over.
+ */
+#define bio_for_each_folio_all(fi, bio) \
+ for (bio_first_folio(&fi, bio, 0); fi.folio; bio_next_folio(&fi, bio))
+
enum bip_flags {
BIP_BLOCK_INTEGRITY = 1 << 0, /* block layer owns integrity data */
BIP_MAPPED_INTEGRITY = 1 << 1, /* ref tag has been remapped */
--
2.33.0
The big comment about only using a head page can go away now that
it takes a folio argument.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 9cc5798423d1..24a2aa69c467 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -22,8 +22,8 @@
#include "../internal.h"
/*
- * Structure allocated for each page or THP when block size < page size
- * to track sub-page uptodate status and I/O completions.
+ * Structure allocated for each folio when block size < folio size
+ * to track sub-folio uptodate status and I/O completions.
*/
struct iomap_page {
atomic_t read_bytes_pending;
@@ -32,17 +32,10 @@ struct iomap_page {
unsigned long uptodate[];
};
-static inline struct iomap_page *to_iomap_page(struct page *page)
+static inline struct iomap_page *to_iomap_page(struct folio *folio)
{
- /*
- * per-block data is stored in the head page. Callers should
- * not be dealing with tail pages, and if they are, they can
- * call thp_head() first.
- */
- VM_BUG_ON_PGFLAGS(PageTail(page), page);
-
- if (page_has_private(page))
- return (struct iomap_page *)page_private(page);
+ if (folio_test_private(folio))
+ return folio_get_private(folio);
return NULL;
}
@@ -51,7 +44,8 @@ static struct bio_set iomap_ioend_bioset;
static struct iomap_page *
iomap_page_create(struct inode *inode, struct page *page)
{
- struct iomap_page *iop = to_iomap_page(page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
unsigned int nr_blocks = i_blocks_per_page(inode, page);
if (iop || nr_blocks <= 1)
@@ -144,7 +138,8 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
static void
iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
{
- struct iomap_page *iop = to_iomap_page(page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
struct inode *inode = page->mapping->host;
unsigned first = off >> inode->i_blkbits;
unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -173,7 +168,8 @@ static void
iomap_read_page_end_io(struct bio_vec *bvec, int error)
{
struct page *page = bvec->bv_page;
- struct iomap_page *iop = to_iomap_page(page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
if (unlikely(error)) {
ClearPageUptodate(page);
@@ -427,7 +423,8 @@ int
iomap_is_partially_uptodate(struct page *page, unsigned long from,
unsigned long count)
{
- struct iomap_page *iop = to_iomap_page(page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
struct inode *inode = page->mapping->host;
unsigned len, first, last;
unsigned i;
@@ -1003,7 +1000,8 @@ static void
iomap_finish_page_writeback(struct inode *inode, struct page *page,
int error, unsigned int len)
{
- struct iomap_page *iop = to_iomap_page(page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
if (error) {
SetPageError(page);
--
2.33.0
On 11/1/21 2:39 PM, Matthew Wilcox (Oracle) wrote:
> This is a thin wrapper around bio_add_page(). The main advantage here
> is the documentation that stupidly large folios are not supported.
> It's not currently possible to allocate stupidly large folios, but if
> it ever becomes possible, this function will fail gracefully instead of
> doing I/O to the wrong bytes.
Might be better with UINT_MAX instead of stupidly here, because then
it immediately makes sense. Can you make a change to that effect?
With that:
Reviewed-by: Jens Axboe <[email protected]>
--
Jens Axboe
This function already assumed it was being passed a head page, so
just formalise that.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 24a2aa69c467..d96c00c1e9e3 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -42,11 +42,10 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
static struct bio_set iomap_ioend_bioset;
static struct iomap_page *
-iomap_page_create(struct inode *inode, struct page *page)
+iomap_page_create(struct inode *inode, struct folio *folio)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop = to_iomap_page(folio);
- unsigned int nr_blocks = i_blocks_per_page(inode, page);
+ unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
if (iop || nr_blocks <= 1)
return iop;
@@ -54,9 +53,9 @@ iomap_page_create(struct inode *inode, struct page *page)
iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
GFP_NOFS | __GFP_NOFAIL);
spin_lock_init(&iop->uptodate_lock);
- if (PageUptodate(page))
+ if (folio_test_uptodate(folio))
bitmap_fill(iop->uptodate, nr_blocks);
- attach_page_private(page, iop);
+ folio_attach_private(folio, iop);
return iop;
}
@@ -204,6 +203,7 @@ struct iomap_readpage_ctx {
static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
struct page *page)
{
+ struct folio *folio = page_folio(page);
const struct iomap *iomap = iomap_iter_srcmap(iter);
size_t size = i_size_read(iter->inode) - iomap->offset;
size_t poff = offset_in_page(iomap->offset);
@@ -220,7 +220,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
if (WARN_ON_ONCE(size > iomap->length))
return -EIO;
if (poff > 0)
- iomap_page_create(iter->inode, page);
+ iomap_page_create(iter->inode, folio);
addr = kmap_local_page(page) + poff;
memcpy(addr, iomap->inline_data, size);
@@ -247,6 +247,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
loff_t pos = iter->pos + offset;
loff_t length = iomap_length(iter) - offset;
struct page *page = ctx->cur_page;
+ struct folio *folio = page_folio(page);
struct iomap_page *iop;
loff_t orig_pos = pos;
unsigned poff, plen;
@@ -256,7 +257,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
return min(iomap_read_inline_data(iter, page), length);
/* zero post-eof blocks as the page may be mapped */
- iop = iomap_page_create(iter->inode, page);
+ iop = iomap_page_create(iter->inode, folio);
iomap_adjust_read_range(iter->inode, iop, &pos, length, &poff, &plen);
if (plen == 0)
goto done;
@@ -536,8 +537,9 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
unsigned len, struct page *page)
{
+ struct folio *folio = page_folio(page);
const struct iomap *srcmap = iomap_iter_srcmap(iter);
- struct iomap_page *iop = iomap_page_create(iter->inode, page);
+ struct iomap_page *iop = iomap_page_create(iter->inode, folio);
loff_t block_size = i_blocksize(iter->inode);
loff_t block_start = round_down(pos, block_size);
loff_t block_end = round_up(pos + len, block_size);
@@ -1287,7 +1289,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct inode *inode,
struct page *page, u64 end_offset)
{
- struct iomap_page *iop = iomap_page_create(inode, page);
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = iomap_page_create(inode, folio);
struct iomap_ioend *ioend, *next;
unsigned len = i_blocksize(inode);
u64 file_offset; /* file offset of page */
--
2.33.0
On 11/1/21 2:39 PM, Matthew Wilcox (Oracle) wrote:
> Allow callers to iterate over each folio instead of each page. The
> bio need not have been constructed using folios originally.
Reviewed-by: Jens Axboe <[email protected]>
--
Jens Axboe
iomap_page_release() was also assuming that it was being passed a
head page.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/iomap/buffered-io.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d96c00c1e9e3..b8984f39d8b0 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -59,18 +59,18 @@ iomap_page_create(struct inode *inode, struct folio *folio)
return iop;
}
-static void
-iomap_page_release(struct page *page)
+static void iomap_page_release(struct folio *folio)
{
- struct iomap_page *iop = detach_page_private(page);
- unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page);
+ struct iomap_page *iop = folio_detach_private(folio);
+ struct inode *inode = folio->mapping->host;
+ unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
if (!iop)
return;
WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
- PageUptodate(page));
+ folio_test_uptodate(folio));
kfree(iop);
}
@@ -451,6 +451,8 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
int
iomap_releasepage(struct page *page, gfp_t gfp_mask)
{
+ struct folio *folio = page_folio(page);
+
trace_iomap_releasepage(page->mapping->host, page_offset(page),
PAGE_SIZE);
@@ -461,7 +463,7 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
*/
if (PageDirty(page) || PageWriteback(page))
return 0;
- iomap_page_release(page);
+ iomap_page_release(folio);
return 1;
}
EXPORT_SYMBOL_GPL(iomap_releasepage);
@@ -469,6 +471,8 @@ EXPORT_SYMBOL_GPL(iomap_releasepage);
void
iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
{
+ struct folio *folio = page_folio(page);
+
trace_iomap_invalidatepage(page->mapping->host, offset, len);
/*
@@ -478,7 +482,7 @@ iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
if (offset == 0 && len == PAGE_SIZE) {
WARN_ON_ONCE(PageWriteback(page));
cancel_dirty_page(page);
- iomap_page_release(page);
+ iomap_page_release(folio);
}
}
EXPORT_SYMBOL_GPL(iomap_invalidatepage);
--
2.33.0
This is an address_space operation, so its argument must remain as a
struct page, but we can use a folio internally.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b8984f39d8b0..a6b64a1ad468 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -453,15 +453,15 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
{
struct folio *folio = page_folio(page);
- trace_iomap_releasepage(page->mapping->host, page_offset(page),
- PAGE_SIZE);
+ trace_iomap_releasepage(folio->mapping->host, folio_pos(folio),
+ folio_size(folio));
/*
* mm accommodates an old ext3 case where clean pages might not have had
* the dirty bit cleared. Thus, it can send actual dirty pages to
* ->releasepage() via shrink_active_list(); skip those here.
*/
- if (PageDirty(page) || PageWriteback(page))
+ if (folio_test_dirty(folio) || folio_test_writeback(folio))
return 0;
iomap_page_release(folio);
return 1;
--
2.33.0
All but one caller already has the iomap_page, so we can avoid getting
it again.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e9a60520e769..e171eb2ebc5d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -134,11 +134,9 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
*lenp = plen;
}
-static void
-iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct page *page,
+ struct iomap_page *iop, unsigned off, unsigned len)
{
- struct folio *folio = page_folio(page);
- struct iomap_page *iop = to_iomap_page(folio);
struct inode *inode = page->mapping->host;
unsigned first = off >> inode->i_blkbits;
unsigned last = (off + len - 1) >> inode->i_blkbits;
@@ -151,14 +149,14 @@ iomap_iop_set_range_uptodate(struct page *page, unsigned off, unsigned len)
spin_unlock_irqrestore(&iop->uptodate_lock, flags);
}
-static void
-iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct page *page,
+ struct iomap_page *iop, unsigned off, unsigned len)
{
if (PageError(page))
return;
- if (page_has_private(page))
- iomap_iop_set_range_uptodate(page, off, len);
+ if (iop)
+ iomap_iop_set_range_uptodate(page, iop, off, len);
else
SetPageUptodate(page);
}
@@ -174,7 +172,8 @@ iomap_read_page_end_io(struct bio_vec *bvec, int error)
ClearPageUptodate(page);
SetPageError(page);
} else {
- iomap_set_range_uptodate(page, bvec->bv_offset, bvec->bv_len);
+ iomap_set_range_uptodate(page, iop, bvec->bv_offset,
+ bvec->bv_len);
}
if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
@@ -204,6 +203,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
struct page *page)
{
struct folio *folio = page_folio(page);
+ struct iomap_page *iop;
const struct iomap *iomap = iomap_iter_srcmap(iter);
size_t size = i_size_read(iter->inode) - iomap->offset;
size_t poff = offset_in_page(iomap->offset);
@@ -220,13 +220,15 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
if (WARN_ON_ONCE(size > iomap->length))
return -EIO;
if (poff > 0)
- iomap_page_create(iter->inode, folio);
+ iop = iomap_page_create(iter->inode, folio);
+ else
+ iop = to_iomap_page(folio);
addr = kmap_local_page(page) + poff;
memcpy(addr, iomap->inline_data, size);
memset(addr + size, 0, PAGE_SIZE - poff - size);
kunmap_local(addr);
- iomap_set_range_uptodate(page, poff, PAGE_SIZE - poff);
+ iomap_set_range_uptodate(page, iop, poff, PAGE_SIZE - poff);
return PAGE_SIZE - poff;
}
@@ -264,7 +266,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
if (iomap_block_needs_zeroing(iter, pos)) {
zero_user(page, poff, plen);
- iomap_set_range_uptodate(page, poff, plen);
+ iomap_set_range_uptodate(page, iop, poff, plen);
goto done;
}
@@ -578,7 +580,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
if (status)
return status;
}
- iomap_set_range_uptodate(page, poff, plen);
+ iomap_set_range_uptodate(page, iop, poff, plen);
} while ((block_start += plen) < block_end);
return 0;
@@ -653,6 +655,8 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
size_t copied, struct page *page)
{
+ struct folio *folio = page_folio(page);
+ struct iomap_page *iop = to_iomap_page(folio);
flush_dcache_page(page);
/*
@@ -668,7 +672,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
*/
if (unlikely(copied < len && !PageUptodate(page)))
return 0;
- iomap_set_range_uptodate(page, offset_in_page(pos), len);
+ iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
__set_page_dirty_nobuffers(page);
return copied;
}
--
2.33.0
Use bio_for_each_folio_all() to iterate over each folio in the bio
instead of iterating over each page.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 50 ++++++++++++++++++------------------------
1 file changed, 21 insertions(+), 29 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e171eb2ebc5d..d519972a11f1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -161,34 +161,29 @@ static void iomap_set_range_uptodate(struct page *page,
SetPageUptodate(page);
}
-static void
-iomap_read_page_end_io(struct bio_vec *bvec, int error)
+static void iomap_finish_folio_read(struct folio *folio, size_t offset,
+ size_t len, int error)
{
- struct page *page = bvec->bv_page;
- struct folio *folio = page_folio(page);
struct iomap_page *iop = to_iomap_page(folio);
if (unlikely(error)) {
- ClearPageUptodate(page);
- SetPageError(page);
+ folio_clear_uptodate(folio);
+ folio_set_error(folio);
} else {
- iomap_set_range_uptodate(page, iop, bvec->bv_offset,
- bvec->bv_len);
+ iomap_set_range_uptodate(&folio->page, iop, offset, len);
}
- if (!iop || atomic_sub_and_test(bvec->bv_len, &iop->read_bytes_pending))
- unlock_page(page);
+ if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
+ folio_unlock(folio);
}
-static void
-iomap_read_end_io(struct bio *bio)
+static void iomap_read_end_io(struct bio *bio)
{
int error = blk_status_to_errno(bio->bi_status);
- struct bio_vec *bvec;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
- bio_for_each_segment_all(bvec, bio, iter_all)
- iomap_read_page_end_io(bvec, error);
+ bio_for_each_folio_all(fi, bio)
+ iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error);
bio_put(bio);
}
@@ -1010,23 +1005,21 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
}
EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
-static void
-iomap_finish_page_writeback(struct inode *inode, struct page *page,
- int error, unsigned int len)
+static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
+ size_t len, int error)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop = to_iomap_page(folio);
if (error) {
- SetPageError(page);
+ folio_set_error(folio);
mapping_set_error(inode->i_mapping, error);
}
- WARN_ON_ONCE(i_blocks_per_page(inode, page) > 1 && !iop);
+ WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !iop);
WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
- end_page_writeback(page);
+ folio_end_writeback(folio);
}
/*
@@ -1045,8 +1038,7 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
bool quiet = bio_flagged(bio, BIO_QUIET);
for (bio = &ioend->io_inline_bio; bio; bio = next) {
- struct bio_vec *bv;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
/*
* For the last bio, bi_private points to the ioend, so we
@@ -1057,10 +1049,10 @@ iomap_finish_ioend(struct iomap_ioend *ioend, int error)
else
next = bio->bi_private;
- /* walk each page on bio, ending page IO on them */
- bio_for_each_segment_all(bv, bio, iter_all)
- iomap_finish_page_writeback(inode, bv->bv_page, error,
- bv->bv_len);
+ /* walk all folios in bio, ending page IO on them */
+ bio_for_each_folio_all(fi, bio)
+ iomap_finish_folio_write(inode, fi.folio, fi.length,
+ error);
bio_put(bio);
}
/* The ioend has been freed by bio_put() */
--
2.33.0
Keep iomap_invalidatepage around as a wrapper for use in address_space
operations.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 20 ++++++++++++--------
include/linux/iomap.h | 1 +
2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index a6b64a1ad468..e9a60520e769 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -468,23 +468,27 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
}
EXPORT_SYMBOL_GPL(iomap_releasepage);
-void
-iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
+void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
{
- struct folio *folio = page_folio(page);
-
- trace_iomap_invalidatepage(page->mapping->host, offset, len);
+ trace_iomap_invalidatepage(folio->mapping->host, offset, len);
/*
* If we're invalidating the entire page, clear the dirty state from it
* and release it to avoid unnecessary buildup of the LRU.
*/
- if (offset == 0 && len == PAGE_SIZE) {
- WARN_ON_ONCE(PageWriteback(page));
- cancel_dirty_page(page);
+ if (offset == 0 && len == folio_size(folio)) {
+ WARN_ON_ONCE(folio_test_writeback(folio));
+ folio_cancel_dirty(folio);
iomap_page_release(folio);
}
}
+EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
+
+void iomap_invalidatepage(struct page *page, unsigned int offset,
+ unsigned int len)
+{
+ iomap_invalidate_folio(page_folio(page), offset, len);
+}
EXPORT_SYMBOL_GPL(iomap_invalidatepage);
#ifdef CONFIG_MIGRATION
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 63f4ea4dac9b..91de58ca09fc 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -225,6 +225,7 @@ void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
int iomap_is_partially_uptodate(struct page *page, unsigned long from,
unsigned long count);
int iomap_releasepage(struct page *page, gfp_t gfp_mask);
+void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
void iomap_invalidatepage(struct page *page, unsigned int offset,
unsigned int len);
#ifdef CONFIG_MIGRATION
--
2.33.0
Pass a folio around instead of the page, and make sure the offset
is relative to the start of the folio instead of the start of a page.
Also use size_t for offset & length to make it clear that these are byte
counts, and to support >2GB folios in the future.
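To make that concrete (assuming 4KiB pages): in a 16KiB folio that starts
at file position 0, byte 5000 of the file has offset_in_page() == 904 but
offset_in_folio() == 5000, and it is the folio-relative value that now
indexes the uptodate bitmap:

	size_t poff = offset_in_folio(folio, pos);	/* 0 .. folio_size() - 1 */
	unsigned first = poff >> inode->i_blkbits;	/* first block to mark */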
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 79 ++++++++++++++++++++++--------------------
1 file changed, 41 insertions(+), 38 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index d519972a11f1..dea577380215 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -75,18 +75,18 @@ static void iomap_page_release(struct folio *folio)
}
/*
- * Calculate the range inside the page that we actually need to read.
+ * Calculate the range inside the folio that we actually need to read.
*/
-static void
-iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
- loff_t *pos, loff_t length, unsigned *offp, unsigned *lenp)
+static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
+ loff_t *pos, loff_t length, size_t *offp, size_t *lenp)
{
+ struct iomap_page *iop = to_iomap_page(folio);
loff_t orig_pos = *pos;
loff_t isize = i_size_read(inode);
unsigned block_bits = inode->i_blkbits;
unsigned block_size = (1 << block_bits);
- unsigned poff = offset_in_page(*pos);
- unsigned plen = min_t(loff_t, PAGE_SIZE - poff, length);
+ size_t poff = offset_in_folio(folio, *pos);
+ size_t plen = min_t(loff_t, folio_size(folio) - poff, length);
unsigned first = poff >> block_bits;
unsigned last = (poff + plen - 1) >> block_bits;
@@ -124,7 +124,7 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
* page cache for blocks that are entirely outside of i_size.
*/
if (orig_pos <= isize && orig_pos + length > isize) {
- unsigned end = offset_in_page(isize - 1) >> block_bits;
+ unsigned end = offset_in_folio(folio, isize - 1) >> block_bits;
if (first <= end && last > end)
plen -= (last - end) * block_size;
@@ -134,31 +134,31 @@ iomap_adjust_read_range(struct inode *inode, struct iomap_page *iop,
*lenp = plen;
}
-static void iomap_iop_set_range_uptodate(struct page *page,
- struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_iop_set_range_uptodate(struct folio *folio,
+ struct iomap_page *iop, size_t off, size_t len)
{
- struct inode *inode = page->mapping->host;
+ struct inode *inode = folio->mapping->host;
unsigned first = off >> inode->i_blkbits;
unsigned last = (off + len - 1) >> inode->i_blkbits;
unsigned long flags;
spin_lock_irqsave(&iop->uptodate_lock, flags);
bitmap_set(iop->uptodate, first, last - first + 1);
- if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page)))
- SetPageUptodate(page);
+ if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
+ folio_mark_uptodate(folio);
spin_unlock_irqrestore(&iop->uptodate_lock, flags);
}
-static void iomap_set_range_uptodate(struct page *page,
- struct iomap_page *iop, unsigned off, unsigned len)
+static void iomap_set_range_uptodate(struct folio *folio,
+ struct iomap_page *iop, size_t off, size_t len)
{
- if (PageError(page))
+ if (folio_test_error(folio))
return;
if (iop)
- iomap_iop_set_range_uptodate(page, iop, off, len);
+ iomap_iop_set_range_uptodate(folio, iop, off, len);
else
- SetPageUptodate(page);
+ folio_mark_uptodate(folio);
}
static void iomap_finish_folio_read(struct folio *folio, size_t offset,
@@ -170,7 +170,7 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset,
folio_clear_uptodate(folio);
folio_set_error(folio);
} else {
- iomap_set_range_uptodate(&folio->page, iop, offset, len);
+ iomap_set_range_uptodate(folio, iop, offset, len);
}
if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
@@ -202,6 +202,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
const struct iomap *iomap = iomap_iter_srcmap(iter);
size_t size = i_size_read(iter->inode) - iomap->offset;
size_t poff = offset_in_page(iomap->offset);
+ size_t offset = offset_in_folio(folio, iomap->offset);
void *addr;
if (PageUptodate(page))
@@ -214,7 +215,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
return -EIO;
if (WARN_ON_ONCE(size > iomap->length))
return -EIO;
- if (poff > 0)
+ if (offset > 0)
iop = iomap_page_create(iter->inode, folio);
else
iop = to_iomap_page(folio);
@@ -223,7 +224,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
memcpy(addr, iomap->inline_data, size);
memset(addr + size, 0, PAGE_SIZE - poff - size);
kunmap_local(addr);
- iomap_set_range_uptodate(page, iop, poff, PAGE_SIZE - poff);
+ iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff);
return PAGE_SIZE - poff;
}
@@ -247,7 +248,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
struct folio *folio = page_folio(page);
struct iomap_page *iop;
loff_t orig_pos = pos;
- unsigned poff, plen;
+ size_t poff, plen;
sector_t sector;
if (iomap->type == IOMAP_INLINE)
@@ -255,13 +256,13 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
/* zero post-eof blocks as the page may be mapped */
iop = iomap_page_create(iter->inode, folio);
- iomap_adjust_read_range(iter->inode, iop, &pos, length, &poff, &plen);
+ iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
if (plen == 0)
goto done;
if (iomap_block_needs_zeroing(iter, pos)) {
- zero_user(page, poff, plen);
- iomap_set_range_uptodate(page, iop, poff, plen);
+ zero_user(&folio->page, poff, plen);
+ iomap_set_range_uptodate(folio, iop, poff, plen);
goto done;
}
@@ -272,7 +273,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
sector = iomap_sector(iomap, pos);
if (!ctx->bio ||
bio_end_sector(ctx->bio) != sector ||
- bio_add_page(ctx->bio, page, plen, poff) != plen) {
+ !bio_add_folio(ctx->bio, folio, plen, poff)) {
gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
gfp_t orig_gfp = gfp;
unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
@@ -296,8 +297,9 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
ctx->bio->bi_iter.bi_sector = sector;
bio_set_dev(ctx->bio, iomap->bdev);
ctx->bio->bi_end_io = iomap_read_end_io;
- __bio_add_page(ctx->bio, page, plen, poff);
+ bio_add_folio(ctx->bio, folio, plen, poff);
}
+
done:
/*
* Move the caller beyond our range so that it keeps making progress.
@@ -524,9 +526,8 @@ iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
truncate_pagecache_range(inode, max(pos, i_size), pos + len);
}
-static int
-iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
- unsigned plen, const struct iomap *iomap)
+static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
+ size_t poff, size_t plen, const struct iomap *iomap)
{
struct bio_vec bvec;
struct bio bio;
@@ -535,7 +536,7 @@ iomap_read_page_sync(loff_t block_start, struct page *page, unsigned poff,
bio.bi_opf = REQ_OP_READ;
bio.bi_iter.bi_sector = iomap_sector(iomap, block_start);
bio_set_dev(&bio, iomap->bdev);
- __bio_add_page(&bio, page, plen, poff);
+ bio_add_folio(&bio, folio, plen, poff);
return submit_bio_wait(&bio);
}
@@ -548,14 +549,15 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
loff_t block_size = i_blocksize(iter->inode);
loff_t block_start = round_down(pos, block_size);
loff_t block_end = round_up(pos + len, block_size);
- unsigned from = offset_in_page(pos), to = from + len, poff, plen;
+ size_t from = offset_in_folio(folio, pos), to = from + len;
+ size_t poff, plen;
- if (PageUptodate(page))
+ if (folio_test_uptodate(folio))
return 0;
- ClearPageError(page);
+ folio_clear_error(folio);
do {
- iomap_adjust_read_range(iter->inode, iop, &block_start,
+ iomap_adjust_read_range(iter->inode, folio, &block_start,
block_end - block_start, &poff, &plen);
if (plen == 0)
break;
@@ -568,14 +570,15 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
if (iomap_block_needs_zeroing(iter, block_start)) {
if (WARN_ON_ONCE(iter->flags & IOMAP_UNSHARE))
return -EIO;
- zero_user_segments(page, poff, from, to, poff + plen);
+ zero_user_segments(&folio->page, poff, from, to,
+ poff + plen);
} else {
- int status = iomap_read_page_sync(block_start, page,
+ int status = iomap_read_folio_sync(block_start, folio,
poff, plen, srcmap);
if (status)
return status;
}
- iomap_set_range_uptodate(page, iop, poff, plen);
+ iomap_set_range_uptodate(folio, iop, poff, plen);
} while ((block_start += plen) < block_end);
return 0;
@@ -667,7 +670,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
*/
if (unlikely(copied < len && !PageUptodate(page)))
return 0;
- iomap_set_range_uptodate(page, iop, offset_in_page(pos), len);
+ iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
__set_page_dirty_nobuffers(page);
return copied;
}
--
2.33.0
We still only support up to a single page of inline data (at least,
per call to iomap_read_inline_data()), but it can now be written into
the middle of a folio in case we decide to allocate a 16KiB folio for
a file that's 8.1KiB in size.
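Concretely, assuming 4KiB pages and blocks: the inline tail of that 8.1KiB
file sits at file offset 8192, so offset_in_page() of the inline region is 0
while offset_in_folio() is 8192, and the data lands in the folio's third
page; that is why kmap_local_folio() is used rather than kmap_local_page().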
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index dea577380215..b5e77d9de4a7 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -195,9 +195,8 @@ struct iomap_readpage_ctx {
};
static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
- struct page *page)
+ struct folio *folio)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop;
const struct iomap *iomap = iomap_iter_srcmap(iter);
size_t size = i_size_read(iter->inode) - iomap->offset;
@@ -205,7 +204,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
size_t offset = offset_in_folio(folio, iomap->offset);
void *addr;
- if (PageUptodate(page))
+ if (folio_test_uptodate(folio))
return PAGE_SIZE - poff;
if (WARN_ON_ONCE(size > PAGE_SIZE - poff))
@@ -220,7 +219,7 @@ static loff_t iomap_read_inline_data(const struct iomap_iter *iter,
else
iop = to_iomap_page(folio);
- addr = kmap_local_page(page) + poff;
+ addr = kmap_local_folio(folio, offset);
memcpy(addr, iomap->inline_data, size);
memset(addr + size, 0, PAGE_SIZE - poff - size);
kunmap_local(addr);
@@ -252,7 +251,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
sector_t sector;
if (iomap->type == IOMAP_INLINE)
- return min(iomap_read_inline_data(iter, page), length);
+ return min(iomap_read_inline_data(iter, folio), length);
/* zero post-eof blocks as the page may be mapped */
iop = iomap_page_create(iter->inode, folio);
@@ -587,12 +586,13 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
static int iomap_write_begin_inline(const struct iomap_iter *iter,
struct page *page)
{
+ struct folio *folio = page_folio(page);
int ret;
/* needs more work for the tailpacking case; disable for now */
if (WARN_ON_ONCE(iomap_iter_srcmap(iter)->offset != 0))
return -EIO;
- ret = iomap_read_inline_data(iter, page);
+ ret = iomap_read_inline_data(iter, folio);
if (ret < 0)
return ret;
return 0;
--
2.33.0
Handle folios of arbitrary size instead of working in PAGE_SIZE units.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 53 +++++++++++++++++++++---------------------
1 file changed, 26 insertions(+), 27 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b5e77d9de4a7..3c68ff26cd16 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -188,8 +188,8 @@ static void iomap_read_end_io(struct bio *bio)
}
struct iomap_readpage_ctx {
- struct page *cur_page;
- bool cur_page_in_bio;
+ struct folio *cur_folio;
+ bool cur_folio_in_bio;
struct bio *bio;
struct readahead_control *rac;
};
@@ -243,8 +243,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
const struct iomap *iomap = &iter->iomap;
loff_t pos = iter->pos + offset;
loff_t length = iomap_length(iter) - offset;
- struct page *page = ctx->cur_page;
- struct folio *folio = page_folio(page);
+ struct folio *folio = ctx->cur_folio;
struct iomap_page *iop;
loff_t orig_pos = pos;
size_t poff, plen;
@@ -265,7 +264,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
goto done;
}
- ctx->cur_page_in_bio = true;
+ ctx->cur_folio_in_bio = true;
if (iop)
atomic_add(plen, &iop->read_bytes_pending);
@@ -273,7 +272,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
if (!ctx->bio ||
bio_end_sector(ctx->bio) != sector ||
!bio_add_folio(ctx->bio, folio, plen, poff)) {
- gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
+ gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL);
gfp_t orig_gfp = gfp;
unsigned int nr_vecs = DIV_ROUND_UP(length, PAGE_SIZE);
@@ -312,30 +311,31 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
int
iomap_readpage(struct page *page, const struct iomap_ops *ops)
{
+ struct folio *folio = page_folio(page);
struct iomap_iter iter = {
- .inode = page->mapping->host,
- .pos = page_offset(page),
- .len = PAGE_SIZE,
+ .inode = folio->mapping->host,
+ .pos = folio_pos(folio),
+ .len = folio_size(folio),
};
struct iomap_readpage_ctx ctx = {
- .cur_page = page,
+ .cur_folio = folio,
};
int ret;
- trace_iomap_readpage(page->mapping->host, 1);
+ trace_iomap_readpage(iter.inode, 1);
while ((ret = iomap_iter(&iter, ops)) > 0)
iter.processed = iomap_readpage_iter(&iter, &ctx, 0);
if (ret < 0)
- SetPageError(page);
+ folio_set_error(folio);
if (ctx.bio) {
submit_bio(ctx.bio);
- WARN_ON_ONCE(!ctx.cur_page_in_bio);
+ WARN_ON_ONCE(!ctx.cur_folio_in_bio);
} else {
- WARN_ON_ONCE(ctx.cur_page_in_bio);
- unlock_page(page);
+ WARN_ON_ONCE(ctx.cur_folio_in_bio);
+ folio_unlock(folio);
}
/*
@@ -354,15 +354,15 @@ static loff_t iomap_readahead_iter(const struct iomap_iter *iter,
loff_t done, ret;
for (done = 0; done < length; done += ret) {
- if (ctx->cur_page && offset_in_page(iter->pos + done) == 0) {
- if (!ctx->cur_page_in_bio)
- unlock_page(ctx->cur_page);
- put_page(ctx->cur_page);
- ctx->cur_page = NULL;
+ if (ctx->cur_folio &&
+ offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
+ if (!ctx->cur_folio_in_bio)
+ folio_unlock(ctx->cur_folio);
+ ctx->cur_folio = NULL;
}
- if (!ctx->cur_page) {
- ctx->cur_page = readahead_page(ctx->rac);
- ctx->cur_page_in_bio = false;
+ if (!ctx->cur_folio) {
+ ctx->cur_folio = readahead_folio(ctx->rac);
+ ctx->cur_folio_in_bio = false;
}
ret = iomap_readpage_iter(iter, ctx, done);
}
@@ -403,10 +403,9 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
if (ctx.bio)
submit_bio(ctx.bio);
- if (ctx.cur_page) {
- if (!ctx.cur_page_in_bio)
- unlock_page(ctx.cur_page);
- put_page(ctx.cur_page);
+ if (ctx.cur_folio) {
+ if (!ctx.cur_folio_in_bio)
+ folio_unlock(ctx.cur_folio);
}
}
EXPORT_SYMBOL_GPL(iomap_readahead);
--
2.33.0
If we write to any page in a folio, we have to mark the entire
folio as dirty, and potentially COW the entire folio, because it'll
all get written back as one unit.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 3c68ff26cd16..b55d947867b1 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -959,21 +959,21 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero,
}
EXPORT_SYMBOL_GPL(iomap_truncate_page);
-static loff_t iomap_page_mkwrite_iter(struct iomap_iter *iter,
- struct page *page)
+static loff_t iomap_folio_mkwrite_iter(struct iomap_iter *iter,
+ struct folio *folio)
{
loff_t length = iomap_length(iter);
int ret;
if (iter->iomap.flags & IOMAP_F_BUFFER_HEAD) {
- ret = __block_write_begin_int(page, iter->pos, length, NULL,
- &iter->iomap);
+ ret = __block_write_begin_int(&folio->page, iter->pos, length,
+ NULL, &iter->iomap);
if (ret)
return ret;
- block_commit_write(page, 0, length);
+ block_commit_write(&folio->page, 0, length);
} else {
- WARN_ON_ONCE(!PageUptodate(page));
- set_page_dirty(page);
+ WARN_ON_ONCE(!folio_test_uptodate(folio));
+ folio_mark_dirty(folio);
}
return length;
@@ -985,24 +985,24 @@ vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops)
.inode = file_inode(vmf->vma->vm_file),
.flags = IOMAP_WRITE | IOMAP_FAULT,
};
- struct page *page = vmf->page;
+ struct folio *folio = page_folio(vmf->page);
ssize_t ret;
- lock_page(page);
- ret = page_mkwrite_check_truncate(page, iter.inode);
+ folio_lock(folio);
+ ret = folio_mkwrite_check_truncate(folio, iter.inode);
if (ret < 0)
goto out_unlock;
- iter.pos = page_offset(page);
+ iter.pos = folio_pos(folio);
iter.len = ret;
while ((ret = iomap_iter(&iter, ops)) > 0)
- iter.processed = iomap_page_mkwrite_iter(&iter, page);
+ iter.processed = iomap_folio_mkwrite_iter(&iter, folio);
if (ret < 0)
goto out_unlock;
- wait_for_stable_page(page);
+ folio_wait_stable(folio);
return VM_FAULT_LOCKED;
out_unlock:
- unlock_page(page);
+ folio_unlock(folio);
return block_page_mkwrite_return(ret);
}
EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
--
2.33.0
These functions still only work in PAGE_SIZE chunks, but there are
fewer conversions from tail to head pages as a result of this patch.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 67 ++++++++++++++++++++++--------------------
1 file changed, 35 insertions(+), 32 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index b55d947867b1..6df8fdbb1951 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -539,9 +539,8 @@ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
}
static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
- unsigned len, struct page *page)
+ size_t len, struct folio *folio)
{
- struct folio *folio = page_folio(page);
const struct iomap *srcmap = iomap_iter_srcmap(iter);
struct iomap_page *iop = iomap_page_create(iter->inode, folio);
loff_t block_size = i_blocksize(iter->inode);
@@ -583,9 +582,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
}
static int iomap_write_begin_inline(const struct iomap_iter *iter,
- struct page *page)
+ struct folio *folio)
{
- struct folio *folio = page_folio(page);
int ret;
/* needs more work for the tailpacking case; disable for now */
@@ -598,11 +596,13 @@ static int iomap_write_begin_inline(const struct iomap_iter *iter,
}
static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
- unsigned len, struct page **pagep)
+ size_t len, struct folio **foliop)
{
const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
+ struct folio *folio;
struct page *page;
+ unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
int status = 0;
BUG_ON(pos + len > iter->iomap.offset + iter->iomap.length);
@@ -618,29 +618,30 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
return status;
}
- page = grab_cache_page_write_begin(iter->inode->i_mapping,
- pos >> PAGE_SHIFT, AOP_FLAG_NOFS);
- if (!page) {
+ folio = __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
+ fgp, mapping_gfp_mask(iter->inode->i_mapping));
+ if (!folio) {
status = -ENOMEM;
goto out_no_page;
}
+ page = folio_file_page(folio, pos >> PAGE_SHIFT);
if (srcmap->type == IOMAP_INLINE)
- status = iomap_write_begin_inline(iter, page);
+ status = iomap_write_begin_inline(iter, folio);
else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
status = __block_write_begin_int(page, pos, len, NULL, srcmap);
else
- status = __iomap_write_begin(iter, pos, len, page);
+ status = __iomap_write_begin(iter, pos, len, folio);
if (unlikely(status))
goto out_unlock;
- *pagep = page;
+ *foliop = folio;
return 0;
out_unlock:
- unlock_page(page);
- put_page(page);
+ folio_unlock(folio);
+ folio_put(folio);
iomap_write_failed(iter->inode, pos, len);
out_no_page:
@@ -650,11 +651,10 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
}
static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
- size_t copied, struct page *page)
+ size_t copied, struct folio *folio)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop = to_iomap_page(folio);
- flush_dcache_page(page);
+ flush_dcache_folio(folio);
/*
* The blocks that were entirely written will now be uptodate, so we
@@ -667,10 +667,10 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
* non-uptodate page as a zero-length write, and force the caller to
* redo the whole thing.
*/
- if (unlikely(copied < len && !PageUptodate(page)))
+ if (unlikely(copied < len && !folio_test_uptodate(folio)))
return 0;
iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
- __set_page_dirty_nobuffers(page);
+ filemap_dirty_folio(inode->i_mapping, folio);
return copied;
}
@@ -694,8 +694,9 @@ static size_t iomap_write_end_inline(const struct iomap_iter *iter,
/* Returns the number of bytes copied. May be 0. Cannot be an errno. */
static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
- size_t copied, struct page *page)
+ size_t copied, struct folio *folio)
{
+ struct page *page = folio_file_page(folio, pos >> PAGE_SHIFT);
const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
loff_t old_size = iter->inode->i_size;
@@ -707,7 +708,7 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
copied, page, NULL);
} else {
- ret = __iomap_write_end(iter->inode, pos, len, copied, page);
+ ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
}
/*
@@ -719,13 +720,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
i_size_write(iter->inode, pos + ret);
iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
}
- unlock_page(page);
+ folio_unlock(folio);
if (old_size < pos)
pagecache_isize_extended(iter->inode, old_size, pos);
if (page_ops && page_ops->page_done)
page_ops->page_done(iter->inode, pos, ret, page);
- put_page(page);
+ folio_put(folio);
if (ret < len)
iomap_write_failed(iter->inode, pos, len);
@@ -740,6 +741,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
long status = 0;
do {
+ struct folio *folio;
struct page *page;
unsigned long offset; /* Offset into pagecache page */
unsigned long bytes; /* Bytes to write to page */
@@ -763,16 +765,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
break;
}
- status = iomap_write_begin(iter, pos, bytes, &page);
+ status = iomap_write_begin(iter, pos, bytes, &folio);
if (unlikely(status))
break;
+ page = folio_file_page(folio, pos >> PAGE_SHIFT);
if (mapping_writably_mapped(iter->inode->i_mapping))
flush_dcache_page(page);
copied = copy_page_from_iter_atomic(page, offset, bytes, i);
- status = iomap_write_end(iter, pos, bytes, copied, page);
+ status = iomap_write_end(iter, pos, bytes, copied, folio);
if (unlikely(copied != status))
iov_iter_revert(i, copied - status);
@@ -838,13 +841,13 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
do {
unsigned long offset = offset_in_page(pos);
unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
- struct page *page;
+ struct folio *folio;
- status = iomap_write_begin(iter, pos, bytes, &page);
+ status = iomap_write_begin(iter, pos, bytes, &folio);
if (unlikely(status))
return status;
- status = iomap_write_end(iter, pos, bytes, bytes, page);
+ status = iomap_write_end(iter, pos, bytes, bytes, folio);
if (WARN_ON_ONCE(status == 0))
return -EIO;
@@ -880,19 +883,19 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
{
- struct page *page;
+ struct folio *folio;
int status;
unsigned offset = offset_in_page(pos);
unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
- status = iomap_write_begin(iter, pos, bytes, &page);
+ status = iomap_write_begin(iter, pos, bytes, &folio);
if (status)
return status;
- zero_user(page, offset, bytes);
- mark_page_accessed(page);
+ zero_user(folio_file_page(folio, pos >> PAGE_SHIFT), offset, bytes);
+ folio_mark_accessed(folio);
- return iomap_write_end(iter, pos, bytes, bytes, page);
+ return iomap_write_end(iter, pos, bytes, bytes, folio);
}
static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
--
2.33.0
This conversion is only safe because iomap only supports writes to inline
data which starts at the beginning of the file.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
---
fs/iomap/buffered-io.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6df8fdbb1951..6862487f4067 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -675,16 +675,16 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
}
static size_t iomap_write_end_inline(const struct iomap_iter *iter,
- struct page *page, loff_t pos, size_t copied)
+ struct folio *folio, loff_t pos, size_t copied)
{
const struct iomap *iomap = &iter->iomap;
void *addr;
- WARN_ON_ONCE(!PageUptodate(page));
+ WARN_ON_ONCE(!folio_test_uptodate(folio));
BUG_ON(!iomap_inline_data_valid(iomap));
- flush_dcache_page(page);
- addr = kmap_local_page(page) + pos;
+ flush_dcache_folio(folio);
+ addr = kmap_local_folio(folio, pos);
memcpy(iomap_inline_data(iomap, pos), addr, copied);
kunmap_local(addr);
@@ -703,7 +703,7 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
size_t ret;
if (srcmap->type == IOMAP_INLINE) {
- ret = iomap_write_end_inline(iter, page, pos, copied);
+ ret = iomap_write_end_inline(iter, folio, pos, copied);
} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
copied, page, NULL);
--
2.33.0
The arguments are still pages for now, but we can use folios internally
and cut out a lot of calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 2436933dfe42..3b93fdfedb72 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -493,19 +493,21 @@ int
iomap_migrate_page(struct address_space *mapping, struct page *newpage,
struct page *page, enum migrate_mode mode)
{
+ struct folio *folio = page_folio(page);
+ struct folio *newfolio = page_folio(newpage);
int ret;
- ret = migrate_page_move_mapping(mapping, newpage, page, 0);
+ ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
if (ret != MIGRATEPAGE_SUCCESS)
return ret;
- if (page_has_private(page))
- attach_page_private(newpage, detach_page_private(page));
+ if (folio_test_private(folio))
+ folio_attach_private(newfolio, folio_detach_private(folio));
if (mode != MIGRATE_SYNC_NO_COPY)
- migrate_page_copy(newpage, page);
+ folio_migrate_copy(newfolio, folio);
else
- migrate_page_states(newpage, page);
+ folio_migrate_flags(newfolio, folio);
return MIGRATEPAGE_SUCCESS;
}
EXPORT_SYMBOL_GPL(iomap_migrate_page);
--
2.33.0
Now that iomap has been converted, XFS is multi-page folio safe.
Indicate to the VFS that it can now create multi-page folios for XFS.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/xfs/xfs_icache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f2210d927481..804507c82455 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -87,6 +87,7 @@ xfs_inode_alloc(
/* VFS doesn't initialise i_mode or i_state! */
VFS_I(ip)->i_mode = 0;
VFS_I(ip)->i_state = 0;
+ mapping_set_large_folios(VFS_I(ip)->i_mapping);
XFS_STATS_INC(mp, vn_active);
ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -336,6 +337,7 @@ xfs_reinit_inode(
inode->i_rdev = dev;
inode->i_uid = uid;
inode->i_gid = gid;
+ mapping_set_large_folios(inode->i_mapping);
return error;
}
--
2.33.0
XFS has the only implementation of ->discard_page today, so convert it
to use folios in the same patch as converting the API.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 4 ++--
fs/xfs/xfs_aops.c | 24 ++++++++++++------------
include/linux/iomap.h | 2 +-
3 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6862487f4067..c50ae76835ca 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1349,8 +1349,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
* won't be affected by I/O completion and we must unlock it
* now.
*/
- if (wpc->ops->discard_page)
- wpc->ops->discard_page(page, file_offset);
+ if (wpc->ops->discard_folio)
+ wpc->ops->discard_folio(page_folio(page), file_offset);
if (!count) {
ClearPageUptodate(page);
unlock_page(page);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 34fc6148032a..c6c4d07d0d26 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -428,37 +428,37 @@ xfs_prepare_ioend(
* see a ENOSPC in writeback).
*/
static void
-xfs_discard_page(
- struct page *page,
- loff_t fileoff)
+xfs_discard_folio(
+ struct folio *folio,
+ loff_t pos)
{
- struct inode *inode = page->mapping->host;
+ struct inode *inode = folio->mapping->host;
struct xfs_inode *ip = XFS_I(inode);
struct xfs_mount *mp = ip->i_mount;
- unsigned int pageoff = offset_in_page(fileoff);
- xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, fileoff);
- xfs_fileoff_t pageoff_fsb = XFS_B_TO_FSBT(mp, pageoff);
+ size_t offset = offset_in_folio(folio, pos);
+ xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, pos);
+ xfs_fileoff_t pageoff_fsb = XFS_B_TO_FSBT(mp, offset);
int error;
if (xfs_is_shutdown(mp))
goto out_invalidate;
xfs_alert_ratelimited(mp,
- "page discard on page "PTR_FMT", inode 0x%llx, offset %llu.",
- page, ip->i_ino, fileoff);
+ "page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
+ folio, ip->i_ino, pos);
error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
- i_blocks_per_page(inode, page) - pageoff_fsb);
+ i_blocks_per_folio(inode, folio) - pageoff_fsb);
if (error && !xfs_is_shutdown(mp))
xfs_alert(mp, "page discard unable to remove delalloc mapping.");
out_invalidate:
- iomap_invalidatepage(page, pageoff, PAGE_SIZE - pageoff);
+ iomap_invalidate_folio(folio, offset, folio_size(folio) - offset);
}
static const struct iomap_writeback_ops xfs_writeback_ops = {
.map_blocks = xfs_map_blocks,
.prepare_ioend = xfs_prepare_ioend,
- .discard_page = xfs_discard_page,
+ .discard_folio = xfs_discard_folio,
};
STATIC int
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 91de58ca09fc..1a161314d7e4 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -285,7 +285,7 @@ struct iomap_writeback_ops {
* Optional, allows the file system to discard state on a page where
* we failed to submit any I/O.
*/
- void (*discard_page)(struct page *page, loff_t fileoff);
+ void (*discard_folio)(struct folio *folio, loff_t pos);
};
struct iomap_writepage_ctx {
--
2.33.0
We still iterate one block at a time, but now we call compound_head()
less often. Rename file_offset to pos to fit the rest of the file.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 100 +++++++++++++++++++----------------------
1 file changed, 47 insertions(+), 53 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index c50ae76835ca..2436933dfe42 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1252,29 +1252,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
* first; otherwise finish off the current ioend and start another.
*/
static void
-iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
+iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct list_head *iolist)
{
- sector_t sector = iomap_sector(&wpc->iomap, offset);
+ sector_t sector = iomap_sector(&wpc->iomap, pos);
unsigned len = i_blocksize(inode);
- unsigned poff = offset & (PAGE_SIZE - 1);
+ size_t poff = offset_in_folio(folio, pos);
- if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
+ if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
if (wpc->ioend)
list_add(&wpc->ioend->io_list, iolist);
- wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
+ wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
}
- if (bio_add_page(wpc->ioend->io_bio, page, len, poff) != len) {
+ if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
- __bio_add_page(wpc->ioend->io_bio, page, len, poff);
+ bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
}
if (iop)
atomic_add(len, &iop->write_bytes_pending);
wpc->ioend->io_size += len;
- wbc_account_cgroup_owner(wbc, page, len);
+ wbc_account_cgroup_owner(wbc, &folio->page, len);
}
/*
@@ -1296,45 +1296,43 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
static int
iomap_writepage_map(struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct inode *inode,
- struct page *page, u64 end_offset)
+ struct folio *folio, loff_t end_pos)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop = iomap_page_create(inode, folio);
struct iomap_ioend *ioend, *next;
unsigned len = i_blocksize(inode);
- u64 file_offset; /* file offset of page */
+ unsigned nblocks = i_blocks_per_folio(inode, folio);
+ loff_t pos = folio_pos(folio);
int error = 0, count = 0, i;
LIST_HEAD(submit_list);
WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
/*
- * Walk through the page to find areas to write back. If we run off the
- * end of the current map or find the current map invalid, grab a new
- * one.
+ * Walk through the folio to find areas to write back. If we
+ * run off the end of the current map or find the current map
+ * invalid, grab a new one.
*/
- for (i = 0, file_offset = page_offset(page);
- i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
- i++, file_offset += len) {
+ for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
if (iop && !test_bit(i, iop->uptodate))
continue;
- error = wpc->ops->map_blocks(wpc, inode, file_offset);
+ error = wpc->ops->map_blocks(wpc, inode, pos);
if (error)
break;
if (WARN_ON_ONCE(wpc->iomap.type == IOMAP_INLINE))
continue;
if (wpc->iomap.type == IOMAP_HOLE)
continue;
- iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
+ iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
&submit_list);
count++;
}
WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
- WARN_ON_ONCE(!PageLocked(page));
- WARN_ON_ONCE(PageWriteback(page));
- WARN_ON_ONCE(PageDirty(page));
+ WARN_ON_ONCE(!folio_test_locked(folio));
+ WARN_ON_ONCE(folio_test_writeback(folio));
+ WARN_ON_ONCE(folio_test_dirty(folio));
/*
* We cannot cancel the ioend directly here on error. We may have
@@ -1350,16 +1348,16 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
* now.
*/
if (wpc->ops->discard_folio)
- wpc->ops->discard_folio(page_folio(page), file_offset);
+ wpc->ops->discard_folio(folio, pos);
if (!count) {
- ClearPageUptodate(page);
- unlock_page(page);
+ folio_clear_uptodate(folio);
+ folio_unlock(folio);
goto done;
}
}
- set_page_writeback(page);
- unlock_page(page);
+ folio_start_writeback(folio);
+ folio_unlock(folio);
/*
* Preserve the original error if there was one; catch
@@ -1380,9 +1378,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
* with a partial page truncate on a sub-page block sized filesystem.
*/
if (!count)
- end_page_writeback(page);
+ folio_end_writeback(folio);
done:
- mapping_set_error(page->mapping, error);
+ mapping_set_error(folio->mapping, error);
return error;
}
@@ -1396,16 +1394,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
static int
iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
{
+ struct folio *folio = page_folio(page);
struct iomap_writepage_ctx *wpc = data;
- struct inode *inode = page->mapping->host;
- pgoff_t end_index;
- u64 end_offset;
- loff_t offset;
+ struct inode *inode = folio->mapping->host;
+ loff_t end_pos, isize;
- trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
+ trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
/*
- * Refuse to write the page out if we're called from reclaim context.
+ * Refuse to write the folio out if we're called from reclaim context.
*
* This avoids stack overflows when called from deeply used stacks in
* random callers for direct reclaim or memcg reclaim. We explicitly
@@ -1419,10 +1416,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
goto redirty;
/*
- * Is this page beyond the end of the file?
+ * Is this folio beyond the end of the file?
*
- * The page index is less than the end_index, adjust the end_offset
- * to the highest offset that this page should represent.
+ * The folio index is less than the end_index, adjust the end_pos
+ * to the highest offset that this folio should represent.
* -----------------------------------------------------
* | file mapping | <EOF> |
* -----------------------------------------------------
@@ -1431,11 +1428,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* | desired writeback range | see else |
* ---------------------------------^------------------|
*/
- offset = i_size_read(inode);
- end_index = offset >> PAGE_SHIFT;
- if (page->index < end_index)
- end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
- else {
+ isize = i_size_read(inode);
+ end_pos = folio_pos(folio) + folio_size(folio);
+ if (end_pos - 1 >= isize) {
/*
* Check whether the page to write out is beyond or straddles
* i_size or not.
@@ -1447,7 +1442,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* | | Straddles |
* ---------------------------------^-----------|--------|
*/
- unsigned offset_into_page = offset & (PAGE_SIZE - 1);
+ size_t poff = offset_in_folio(folio, isize);
+ pgoff_t end_index = isize >> PAGE_SHIFT;
/*
* Skip the page if it's fully outside i_size, e.g. due to a
@@ -1466,8 +1462,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* checking if the page is totally beyond i_size or if its
* offset is just equal to the EOF.
*/
- if (page->index > end_index ||
- (page->index == end_index && offset_into_page == 0))
+ if (folio->index > end_index ||
+ (folio->index == end_index && poff == 0))
goto redirty;
/*
@@ -1478,17 +1474,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* memory is zeroed when mapped, and writes to that region are
* not written out to the file."
*/
- zero_user_segment(page, offset_into_page, PAGE_SIZE);
-
- /* Adjust the end_offset to the end of file */
- end_offset = offset;
+ zero_user_segment(&folio->page, poff, folio_size(folio));
+ end_pos = isize;
}
- return iomap_writepage_map(wpc, wbc, inode, page, end_offset);
+ return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
redirty:
- redirty_page_for_writepage(wbc, page);
- unlock_page(page);
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
return 0;
}
--
2.33.0
If we're punching a hole in a multi-page folio, we need to remove the
per-folio iomap data as the folio is about to be split and each page will
need its own. If a dirty folio is only partially-uptodate, the iomap
data contains the information about which blocks cannot be written back,
so assert that a dirty folio is fully uptodate.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
fs/iomap/buffered-io.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 3b93fdfedb72..9d7c91f9ec1d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -470,13 +470,18 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
trace_iomap_invalidatepage(folio->mapping->host, offset, len);
/*
- * If we're invalidating the entire page, clear the dirty state from it
- * and release it to avoid unnecessary buildup of the LRU.
+ * If we're invalidating the entire folio, clear the dirty state
+ * from it and release it to avoid unnecessary buildup of the LRU.
*/
if (offset == 0 && len == folio_size(folio)) {
WARN_ON_ONCE(folio_test_writeback(folio));
folio_cancel_dirty(folio);
iomap_page_release(folio);
+ } else if (folio_test_multi(folio)) {
+ /* Must release the iop so the page can be split */
+ WARN_ON_ONCE(!folio_test_uptodate(folio) &&
+ folio_test_dirty(folio));
+ iomap_page_release(folio);
}
}
EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
--
2.33.0
Hi "Matthew,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on hnaz-mm/master]
[also build test ERROR on axboe-block/for-next linus/master next-20211101]
[cannot apply to xfs-linux/for-next djwong-xfs/djwong-devel v5.15]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patches, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Matthew-Wilcox-Oracle/iomap-xfs-folio-patches/20211102-052926
base: https://github.com/hnaz/linux-mm master
config: sparc64-randconfig-r035-20211101 (attached as .config)
compiler: sparc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/b3cbfa38e55d041252c57ee712d1bbb146a4aee8
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Matthew-Wilcox-Oracle/iomap-xfs-folio-patches/20211102-052926
git checkout b3cbfa38e55d041252c57ee712d1bbb146a4aee8
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=sparc64
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
fs/iomap/buffered-io.c: In function '__iomap_write_end':
>> fs/iomap/buffered-io.c:657:9: error: implicit declaration of function 'flush_dcache_folio'; did you mean 'flush_dcache_page'? [-Werror=implicit-function-declaration]
657 | flush_dcache_folio(folio);
| ^~~~~~~~~~~~~~~~~~
| flush_dcache_page
cc1: some warnings being treated as errors
vim +657 fs/iomap/buffered-io.c
652
653 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
654 size_t copied, struct folio *folio)
655 {
656 struct iomap_page *iop = to_iomap_page(folio);
> 657 flush_dcache_folio(folio);
658
659 /*
660 * The blocks that were entirely written will now be uptodate, so we
661 * don't have to worry about a readpage reading them and overwriting a
662 * partial write. However, if we've encountered a short write and only
663 * partially written into a block, it will not be marked uptodate, so a
664 * readpage might come in and destroy our partial write.
665 *
666 * Do the simplest thing and just treat any short write to a
667 * non-uptodate page as a zero-length write, and force the caller to
668 * redo the whole thing.
669 */
670 if (unlikely(copied < len && !folio_test_uptodate(folio)))
671 return 0;
672 iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
673 filemap_dirty_folio(inode->i_mapping, folio);
674 return copied;
675 }
676
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:15PM +0000, Matthew Wilcox (Oracle) wrote:
> This is an address_space operation, so its argument must remain as a
> struct page, but we can use a folio internally.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:11PM +0000, Matthew Wilcox (Oracle) wrote:
> +static inline
> +void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
Please fix the weird prototype formatting here.
Otherwise looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:16PM +0000, Matthew Wilcox (Oracle) wrote:
> Keep iomap_invalidatepage around as a wrapper for use in address_space
> operations.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:18PM +0000, Matthew Wilcox (Oracle) wrote:
> Use bio_for_each_folio() to iterate over each folio in the bio
> instead of iterating over each page.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Darrick J. Wong <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:19PM +0000, Matthew Wilcox (Oracle) wrote:
> Pass a folio around instead of the page, and make sure the offset
> is relative to the start of the folio instead of the start of a page.
> Also use size_t for offset & length to make it clear that these are byte
> counts, and to support >2GB folios in the future.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Darrick J. Wong <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:24PM +0000, Matthew Wilcox (Oracle) wrote:
> This conversion is only safe because iomap only supports writes to inline
> data which starts at the beginning of the file.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Darrick J. Wong <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:21PM +0000, Matthew Wilcox (Oracle) wrote:
> for (done = 0; done < length; done += ret) {
> - if (ctx->cur_page && offset_in_page(iter->pos + done) == 0) {
> - if (!ctx->cur_page_in_bio)
> - unlock_page(ctx->cur_page);
> - put_page(ctx->cur_page);
> - ctx->cur_page = NULL;
> + if (ctx->cur_folio &&
> + offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
> + if (!ctx->cur_folio_in_bio)
> + folio_unlock(ctx->cur_folio);
> + ctx->cur_folio = NULL;
Where did the put_page here disappear to?
> @@ -403,10 +403,9 @@ void iomap_readahead(struct readahead_control *rac, const struct iomap_ops *ops)
>
> if (ctx.bio)
> submit_bio(ctx.bio);
> - if (ctx.cur_page) {
> - if (!ctx.cur_page_in_bio)
> - unlock_page(ctx.cur_page);
> - put_page(ctx.cur_page);
> + if (ctx.cur_folio) {
> + if (!ctx.cur_folio_in_bio)
> + folio_unlock(ctx.cur_folio);
... and here?
On Mon, Nov 01, 2021 at 08:39:25PM +0000, Matthew Wilcox (Oracle) wrote:
> XFS has the only implementation of ->discard_page today, so convert it
> to use folios in the same patch as converting the API.
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:28PM +0000, Matthew Wilcox (Oracle) wrote:
> If we're punching a hole in a multi-page folio, we need to remove the
> per-folio iomap data as the folio is about to be split and each page will
> need its own. If a dirty folio is only partially-uptodate, the iomap
> data contains the information about which blocks cannot be written back,
> so assert that a dirty folio is fully uptodate.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:27PM +0000, Matthew Wilcox (Oracle) wrote:
> The arguments are still pages for now, but we can use folios internally
> and cut out a lot of calls to compound_head().
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:29PM +0000, Matthew Wilcox (Oracle) wrote:
> Now that iomap has been converted, XFS is multi-page folio safe.
> Indicate to the VFS that it can now create multi-page folios for XFS.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks good,
Reviewed-by: Christoph Hellwig <[email protected]>
On Mon, Nov 01, 2021 at 08:39:26PM +0000, Matthew Wilcox (Oracle) wrote:
> @@ -1431,11 +1428,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * | desired writeback range | see else |
> * ---------------------------------^------------------|
> */
> - offset = i_size_read(inode);
> - end_index = offset >> PAGE_SHIFT;
> - if (page->index < end_index)
> - end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
> - else {
> + isize = i_size_read(inode);
> + end_pos = folio_pos(folio) + folio_size(folio);
> + if (end_pos - 1 >= isize) {
Looking at the code not part of the context this looks fine. But I
really wonder if this (and also the blocks change above) would be
better off being split into separate, clearly documented patches.
Otherwise looks good:
Reviewed-by: Christoph Hellwig <[email protected]>
On Tue, Nov 02, 2021 at 12:20:47AM -0700, Christoph Hellwig wrote:
> On Mon, Nov 01, 2021 at 08:39:21PM +0000, Matthew Wilcox (Oracle) wrote:
> > for (done = 0; done < length; done += ret) {
> > - if (ctx->cur_page && offset_in_page(iter->pos + done) == 0) {
> > - if (!ctx->cur_page_in_bio)
> > - unlock_page(ctx->cur_page);
> > - put_page(ctx->cur_page);
> > - ctx->cur_page = NULL;
> > + if (ctx->cur_folio &&
> > + offset_in_folio(ctx->cur_folio, iter->pos + done) == 0) {
> > + if (!ctx->cur_folio_in_bio)
> > + folio_unlock(ctx->cur_folio);
> > + ctx->cur_folio = NULL;
>
> Where did the put_page here disappear to?
I'll put that explanation in the changelog:
Handle folios of arbitrary size instead of working in PAGE_SIZE units.
readahead_folio() puts the page for you, so this is not quite a mechanical
change.
---
The reason for making that change is that I messed up when introducing the
readahead() operation. I followed the refcounting rule of ->readpages()
instead of the rule of ->readpage(). For a successful readahead, we have
two more atomic operations than necessary. I want to fix that, and
this seems like a good opportunity to do it. Once all filesystems are
converted to call readahead_folio(), we can remove the extra get_page()
and put_page().
I did put an explanation of that in commit 9bf70167e3c6, but it's not
reasonable to expect reviewers to remember that when reviewing changes
to their filesystem's readahead, so I'll be sure to mention it in any
future conversion's changelogs.
mm/filemap: Add readahead_folio()
The pointers stored in the page cache are folios, by definition.
This change comes with a behaviour change -- callers of readahead_folio()
are no longer required to put the page reference themselves. This matches
how readpage works, rather than matching how readpages used to work.
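For a filesystem whose ->readahead has already been converted, the
resulting loop looks roughly like this (an untested sketch, not the iomap
code; example_readahead() is a made-up name and the body fakes a
synchronous read with folio_zero_range(), purely to show that the caller
no longer does a folio_put()):

#include <linux/pagemap.h>
#include <linux/highmem.h>

static void example_readahead(struct readahead_control *rac)
{
	struct folio *folio;

	/*
	 * readahead_folio() hands back each locked folio in turn and drops
	 * the page cache reference that the readahead code took on our
	 * behalf, so the filesystem must not call folio_put() in this loop.
	 */
	while ((folio = readahead_folio(rac))) {
		/* Stand-in for submitting real read I/O. */
		folio_zero_range(folio, 0, folio_size(folio));
		folio_mark_uptodate(folio);
		folio_unlock(folio);
	}
}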
On Mon, Nov 01, 2021 at 02:51:37PM -0600, Jens Axboe wrote:
> On 11/1/21 2:39 PM, Matthew Wilcox (Oracle) wrote:
> > This is a thin wrapper around bio_add_page(). The main advantage here
> > is the documentation that stupidly large folios are not supported.
> > It's not currently possible to allocate stupidly large folios, but if
> > it ever becomes possible, this function will fail gracefully instead of
> > doing I/O to the wrong bytes.
>
> Might be better with UINT_MAX instead of stupidly here, because then
> it immediately makes sense. Can you make a change to that effect?
I'll make it "that folios larger than 2GiB are not supported. It's not
currently possible to allocate folios that large,"
> With that:
>
> Reviewed-by: Jens Axboe <[email protected]>
>
> --
> Jens Axboe
>
On Tue, Nov 02, 2021 at 12:13:22AM -0700, Christoph Hellwig wrote:
> On Mon, Nov 01, 2021 at 08:39:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > +static inline
> > +void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
>
> Please fix the weird prototype formatting here.
I dunno, it looks weirder this way:
-static inline
-void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
+static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
+ int i)
Anyway, I've made that change, but I still prefer it the way I had it.
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <[email protected]>
On Tue, Nov 02, 2021 at 12:26:42AM -0700, Christoph Hellwig wrote:
> Looking at the code not part of the context this looks fine. But I
> really wonder if this (and also the blocks change above) would be
> better off being split into separate, clearly documented patches.
How do these three patches look? I retained your R-b on all three since
I figured the one you offered below was good for all of them.
From ab7cace8f325ca5cc1b1e62e6a8498c84738bc10 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <[email protected]>
Date: Tue, 2 Nov 2021 10:51:55 -0400
Subject: [PATCH 1/3] iomap: Simplify iomap_writepage_map()
Rename end_offset to end_pos and file_offset to pos to match the
rest of the file. Simplify the loop by calculating nblocks
up front instead of each time around the loop.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/iomap/buffered-io.c | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 8f47879f9f05..e32e3cb2cf86 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1296,37 +1296,36 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
static int
iomap_writepage_map(struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct inode *inode,
- struct page *page, u64 end_offset)
+ struct page *page, loff_t end_pos)
{
struct folio *folio = page_folio(page);
struct iomap_page *iop = iomap_page_create(inode, folio);
struct iomap_ioend *ioend, *next;
unsigned len = i_blocksize(inode);
- u64 file_offset; /* file offset of page */
+ unsigned nblocks = i_blocks_per_folio(inode, folio);
+ loff_t pos = folio_pos(folio);
int error = 0, count = 0, i;
LIST_HEAD(submit_list);
WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
/*
- * Walk through the page to find areas to write back. If we run off the
- * end of the current map or find the current map invalid, grab a new
- * one.
+ * Walk through the folio to find areas to write back. If we
+ * run off the end of the current map or find the current map
+ * invalid, grab a new one.
*/
- for (i = 0, file_offset = page_offset(page);
- i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
- i++, file_offset += len) {
+ for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
if (iop && !test_bit(i, iop->uptodate))
continue;
- error = wpc->ops->map_blocks(wpc, inode, file_offset);
+ error = wpc->ops->map_blocks(wpc, inode, pos);
if (error)
break;
if (WARN_ON_ONCE(wpc->iomap.type == IOMAP_INLINE))
continue;
if (wpc->iomap.type == IOMAP_HOLE)
continue;
- iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
+ iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
&submit_list);
count++;
}
@@ -1350,7 +1349,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
* now.
*/
if (wpc->ops->discard_folio)
- wpc->ops->discard_folio(page_folio(page), file_offset);
+ wpc->ops->discard_folio(folio, pos);
if (!count) {
ClearPageUptodate(page);
unlock_page(page);
--
2.33.0
From 07c994353e357c3b4252595a80b86e8565deb09c Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <[email protected]>
Date: Tue, 2 Nov 2021 11:41:16 -0400
Subject: [PATCH 2/3] iomap: Simplify iomap_do_writepage()
Rename end_offset to end_pos and offset_into_page to poff to match the
rest of the file. Simplify the handling of the last page straddling
i_size.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/iomap/buffered-io.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e32e3cb2cf86..4f4f33849417 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1397,9 +1397,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
{
struct iomap_writepage_ctx *wpc = data;
struct inode *inode = page->mapping->host;
- pgoff_t end_index;
- u64 end_offset;
- loff_t offset;
+ loff_t end_pos, isize;
trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
@@ -1430,11 +1428,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* | desired writeback range | see else |
* ---------------------------------^------------------|
*/
- offset = i_size_read(inode);
- end_index = offset >> PAGE_SHIFT;
- if (page->index < end_index)
- end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
- else {
+ isize = i_size_read(inode);
+ end_pos = page_offset(page) + PAGE_SIZE;
+ if (end_pos - 1 >= isize) {
/*
* Check whether the page to write out is beyond or straddles
* i_size or not.
@@ -1446,7 +1442,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* | | Straddles |
* ---------------------------------^-----------|--------|
*/
- unsigned offset_into_page = offset & (PAGE_SIZE - 1);
+ size_t poff = offset_in_page(isize);
+ pgoff_t end_index = isize >> PAGE_SHIFT;
/*
* Skip the page if it's fully outside i_size, e.g. due to a
@@ -1466,7 +1463,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* offset is just equal to the EOF.
*/
if (page->index > end_index ||
- (page->index == end_index && offset_into_page == 0))
+ (page->index == end_index && poff == 0))
goto redirty;
/*
@@ -1477,13 +1474,13 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* memory is zeroed when mapped, and writes to that region are
* not written out to the file."
*/
- zero_user_segment(page, offset_into_page, PAGE_SIZE);
+ zero_user_segment(page, poff, PAGE_SIZE);
/* Adjust the end_offset to the end of file */
- end_offset = offset;
+ end_pos = isize;
}
- return iomap_writepage_map(wpc, wbc, inode, page, end_offset);
+ return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
redirty:
redirty_page_for_writepage(wbc, page);
--
2.33.0
From d5412657a503ae27efb5770fbc1c5c980180c9c4 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <[email protected]>
Date: Tue, 2 Nov 2021 12:45:12 -0400
Subject: [PATCH 3/3] iomap: Convert iomap_add_to_ioend to take a folio
We still iterate one block at a time, but now we call compound_head()
less often.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
---
fs/iomap/buffered-io.c | 70 ++++++++++++++++++++----------------------
1 file changed, 34 insertions(+), 36 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 4f4f33849417..8908368abd49 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1252,29 +1252,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
* first; otherwise finish off the current ioend and start another.
*/
static void
-iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
+iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct list_head *iolist)
{
- sector_t sector = iomap_sector(&wpc->iomap, offset);
+ sector_t sector = iomap_sector(&wpc->iomap, pos);
unsigned len = i_blocksize(inode);
- unsigned poff = offset & (PAGE_SIZE - 1);
+ size_t poff = offset_in_folio(folio, pos);
- if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
+ if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
if (wpc->ioend)
list_add(&wpc->ioend->io_list, iolist);
- wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
+ wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
}
- if (bio_add_page(wpc->ioend->io_bio, page, len, poff) != len) {
+ if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
- __bio_add_page(wpc->ioend->io_bio, page, len, poff);
+ bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
}
if (iop)
atomic_add(len, &iop->write_bytes_pending);
wpc->ioend->io_size += len;
- wbc_account_cgroup_owner(wbc, page, len);
+ wbc_account_cgroup_owner(wbc, &folio->page, len);
}
/*
@@ -1296,9 +1296,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
static int
iomap_writepage_map(struct iomap_writepage_ctx *wpc,
struct writeback_control *wbc, struct inode *inode,
- struct page *page, loff_t end_pos)
+ struct folio *folio, loff_t end_pos)
{
- struct folio *folio = page_folio(page);
struct iomap_page *iop = iomap_page_create(inode, folio);
struct iomap_ioend *ioend, *next;
unsigned len = i_blocksize(inode);
@@ -1325,15 +1324,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
continue;
if (wpc->iomap.type == IOMAP_HOLE)
continue;
- iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
+ iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
&submit_list);
count++;
}
WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
- WARN_ON_ONCE(!PageLocked(page));
- WARN_ON_ONCE(PageWriteback(page));
- WARN_ON_ONCE(PageDirty(page));
+ WARN_ON_ONCE(!folio_test_locked(folio));
+ WARN_ON_ONCE(folio_test_writeback(folio));
+ WARN_ON_ONCE(folio_test_dirty(folio));
/*
* We cannot cancel the ioend directly here on error. We may have
@@ -1351,14 +1350,14 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
if (wpc->ops->discard_folio)
wpc->ops->discard_folio(folio, pos);
if (!count) {
- ClearPageUptodate(page);
- unlock_page(page);
+ folio_clear_uptodate(folio);
+ folio_unlock(folio);
goto done;
}
}
- set_page_writeback(page);
- unlock_page(page);
+ folio_start_writeback(folio);
+ folio_unlock(folio);
/*
* Preserve the original error if there was one; catch
@@ -1379,9 +1378,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
* with a partial page truncate on a sub-page block sized filesystem.
*/
if (!count)
- end_page_writeback(page);
+ folio_end_writeback(folio);
done:
- mapping_set_error(page->mapping, error);
+ mapping_set_error(folio->mapping, error);
return error;
}
@@ -1395,14 +1394,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
static int
iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
{
+ struct folio *folio = page_folio(page);
struct iomap_writepage_ctx *wpc = data;
- struct inode *inode = page->mapping->host;
+ struct inode *inode = folio->mapping->host;
loff_t end_pos, isize;
- trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
+ trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
/*
- * Refuse to write the page out if we're called from reclaim context.
+ * Refuse to write the folio out if we're called from reclaim context.
*
* This avoids stack overflows when called from deeply used stacks in
* random callers for direct reclaim or memcg reclaim. We explicitly
@@ -1416,10 +1416,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
goto redirty;
/*
- * Is this page beyond the end of the file?
+ * Is this folio beyond the end of the file?
*
- * The page index is less than the end_index, adjust the end_offset
- * to the highest offset that this page should represent.
+ * The folio index is less than the end_index, adjust the end_pos
+ * to the highest offset that this folio should represent.
* -----------------------------------------------------
* | file mapping | <EOF> |
* -----------------------------------------------------
@@ -1429,7 +1429,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* ---------------------------------^------------------|
*/
isize = i_size_read(inode);
- end_pos = page_offset(page) + PAGE_SIZE;
+ end_pos = folio_pos(folio) + folio_size(folio);
if (end_pos - 1 >= isize) {
/*
* Check whether the page to write out is beyond or straddles
@@ -1442,7 +1442,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* | | Straddles |
* ---------------------------------^-----------|--------|
*/
- size_t poff = offset_in_page(isize);
+ size_t poff = offset_in_folio(folio, isize);
pgoff_t end_index = isize >> PAGE_SHIFT;
/*
@@ -1462,8 +1462,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* checking if the page is totally beyond i_size or if its
* offset is just equal to the EOF.
*/
- if (page->index > end_index ||
- (page->index == end_index && poff == 0))
+ if (folio->index > end_index ||
+ (folio->index == end_index && poff == 0))
goto redirty;
/*
@@ -1474,17 +1474,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
* memory is zeroed when mapped, and writes to that region are
* not written out to the file."
*/
- zero_user_segment(page, poff, PAGE_SIZE);
-
- /* Adjust the end_offset to the end of file */
+ zero_user_segment(&folio->page, poff, folio_size(folio));
end_pos = isize;
}
- return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
+ return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
redirty:
- redirty_page_for_writepage(wbc, page);
- unlock_page(page);
+ folio_redirty_for_writepage(wbc, folio);
+ folio_unlock(folio);
return 0;
}
--
2.33.0
On Tue, Nov 02, 2021 at 08:24:14PM +0000, Matthew Wilcox wrote:
> On Tue, Nov 02, 2021 at 12:13:22AM -0700, Christoph Hellwig wrote:
> > On Mon, Nov 01, 2021 at 08:39:11PM +0000, Matthew Wilcox (Oracle) wrote:
> > > +static inline
> > > +void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
> >
> > Please fix the weird prototype formatting here.
>
> I dunno, it looks weirder this way:
>
> -static inline
> -void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
> +static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
> + int i)
>
> Anyway, I've made that change, but I still prefer it the way I had it.
I /think/ Christoph meant:
static inline void
bio_first_folio(...)
Though the form that you've changed it to is also fine.
--D
> > Otherwise looks good:
> >
> > Reviewed-by: Christoph Hellwig <[email protected]>
On 11/2/21 4:24 PM, Darrick J. Wong wrote:
> On Tue, Nov 02, 2021 at 08:24:14PM +0000, Matthew Wilcox wrote:
>> On Tue, Nov 02, 2021 at 12:13:22AM -0700, Christoph Hellwig wrote:
>>> On Mon, Nov 01, 2021 at 08:39:11PM +0000, Matthew Wilcox (Oracle) wrote:
>>>> +static inline
>>>> +void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
>>>
>>> Please fix the weird prototype formatting here.
>>
>> I dunno, it looks weirder this way:
>>
>> -static inline
>> -void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
>> +static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
>> + int i)
>>
>> Anyway, I've made that change, but I still prefer it the way I had it.
>
> I /think/ Christoph meant:
>
> static inline void
> bio_first_folio(...)
>
> Though the form that you've changed it to is also fine.
I won't speak for Christoph, but basically everything else in block
follows the:
static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
int i)
{
}
format, and this one should as well.
--
Jens Axboe
On Mon, Nov 01, 2021 at 08:39:16PM +0000, Matthew Wilcox (Oracle) wrote:
> Keep iomap_invalidatepage around as a wrapper for use in address_space
> operations.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks ok,
Reviewed-by: Darrick J. Wong <[email protected]>
--D
> ---
> fs/iomap/buffered-io.c | 20 ++++++++++++--------
> include/linux/iomap.h | 1 +
> 2 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index a6b64a1ad468..e9a60520e769 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -468,23 +468,27 @@ iomap_releasepage(struct page *page, gfp_t gfp_mask)
> }
> EXPORT_SYMBOL_GPL(iomap_releasepage);
>
> -void
> -iomap_invalidatepage(struct page *page, unsigned int offset, unsigned int len)
> +void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
> {
> - struct folio *folio = page_folio(page);
> -
> - trace_iomap_invalidatepage(page->mapping->host, offset, len);
> + trace_iomap_invalidatepage(folio->mapping->host, offset, len);
>
> /*
> * If we're invalidating the entire page, clear the dirty state from it
> * and release it to avoid unnecessary buildup of the LRU.
> */
> - if (offset == 0 && len == PAGE_SIZE) {
> - WARN_ON_ONCE(PageWriteback(page));
> - cancel_dirty_page(page);
> + if (offset == 0 && len == folio_size(folio)) {
> + WARN_ON_ONCE(folio_test_writeback(folio));
> + folio_cancel_dirty(folio);
> iomap_page_release(folio);
> }
> }
> +EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
> +
> +void iomap_invalidatepage(struct page *page, unsigned int offset,
> + unsigned int len)
> +{
> + iomap_invalidate_folio(page_folio(page), offset, len);
> +}
> EXPORT_SYMBOL_GPL(iomap_invalidatepage);
>
> #ifdef CONFIG_MIGRATION
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 63f4ea4dac9b..91de58ca09fc 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -225,6 +225,7 @@ void iomap_readahead(struct readahead_control *, const struct iomap_ops *ops);
> int iomap_is_partially_uptodate(struct page *page, unsigned long from,
> unsigned long count);
> int iomap_releasepage(struct page *page, gfp_t gfp_mask);
> +void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
> void iomap_invalidatepage(struct page *page, unsigned int offset,
> unsigned int len);
> #ifdef CONFIG_MIGRATION
> --
> 2.33.0
>
On Tue, Nov 02, 2021 at 04:33:39PM -0600, Jens Axboe wrote:
> On 11/2/21 4:24 PM, Darrick J. Wong wrote:
> > On Tue, Nov 02, 2021 at 08:24:14PM +0000, Matthew Wilcox wrote:
> >> On Tue, Nov 02, 2021 at 12:13:22AM -0700, Christoph Hellwig wrote:
> >>> On Mon, Nov 01, 2021 at 08:39:11PM +0000, Matthew Wilcox (Oracle) wrote:
> >>>> +static inline
> >>>> +void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
> >>>
> >>> Please fix the weird prototype formatting here.
> >>
> >> I dunno, it looks weirder this way:
> >>
> >> -static inline
> >> -void bio_first_folio(struct folio_iter *fi, struct bio *bio, int i)
> >> +static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
> >> + int i)
> >>
> >> Anyway, I've made that change, but I still prefer it the way I had it.
> >
> > I /think/ Christoph meant:
> >
> > static inline void
> > bio_first_folio(...)
> >
> > Though the form that you've changed it to is also fine.
>
> I won't speak for Christoph, but basically everything else in block
> follows the:
>
> static inline void bio_first_folio(struct folio_iter *fi, struct bio *bio,
> int i)
> {
> }
>
> format, and this one should as well.
Durrr, /me forgot he was looking at block patches, not fs/iomap/. :(
Sorry for the noise.
--D
> --
> Jens Axboe
>
On Tue, Nov 02, 2021 at 12:14:35AM -0700, Christoph Hellwig wrote:
> On Mon, Nov 01, 2021 at 08:39:15PM +0000, Matthew Wilcox (Oracle) wrote:
> > This is an address_space operation, so its argument must remain as a
> > struct page, but we can use a folio internally.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
>
> Looks good,
>
> Reviewed-by: Christoph Hellwig <[email protected]>
This seems reasonable to me too.
Even if my MTA saw "This is an ad" and spat it out. ;)
That has now been fixed, so
Reviewed-by: Darrick J. Wong <[email protected]>
--D
On Mon, Nov 01, 2021 at 08:39:23PM +0000, Matthew Wilcox (Oracle) wrote:
> These functions still only work in PAGE_SIZE chunks, but there are
> fewer conversions from tail to head pages as a result of this patch.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> ---
> fs/iomap/buffered-io.c | 67 ++++++++++++++++++++++--------------------
> 1 file changed, 35 insertions(+), 32 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index b55d947867b1..6df8fdbb1951 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -539,9 +539,8 @@ static int iomap_read_folio_sync(loff_t block_start, struct folio *folio,
> }
>
> static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> - unsigned len, struct page *page)
> + size_t len, struct folio *folio)
> {
> - struct folio *folio = page_folio(page);
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> struct iomap_page *iop = iomap_page_create(iter->inode, folio);
> loff_t block_size = i_blocksize(iter->inode);
> @@ -583,9 +582,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> }
>
> static int iomap_write_begin_inline(const struct iomap_iter *iter,
> - struct page *page)
> + struct folio *folio)
> {
> - struct folio *folio = page_folio(page);
> int ret;
>
> /* needs more work for the tailpacking case; disable for now */
> @@ -598,11 +596,13 @@ static int iomap_write_begin_inline(const struct iomap_iter *iter,
> }
>
> static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> - unsigned len, struct page **pagep)
> + size_t len, struct folio **foliop)
> {
> const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> + struct folio *folio;
> struct page *page;
> + unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
> int status = 0;
>
> BUG_ON(pos + len > iter->iomap.offset + iter->iomap.length);
> @@ -618,29 +618,30 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> return status;
> }
>
> - page = grab_cache_page_write_begin(iter->inode->i_mapping,
> - pos >> PAGE_SHIFT, AOP_FLAG_NOFS);
> - if (!page) {
> + folio = __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
> + fgp, mapping_gfp_mask(iter->inode->i_mapping));
> + if (!folio) {
> status = -ENOMEM;
> goto out_no_page;
> }
>
> + page = folio_file_page(folio, pos >> PAGE_SHIFT);
Isn't this only needed in the BUFFER_HEAD case?
--D
> if (srcmap->type == IOMAP_INLINE)
> - status = iomap_write_begin_inline(iter, page);
> + status = iomap_write_begin_inline(iter, folio);
> else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
> status = __block_write_begin_int(page, pos, len, NULL, srcmap);
> else
> - status = __iomap_write_begin(iter, pos, len, page);
> + status = __iomap_write_begin(iter, pos, len, folio);
>
> if (unlikely(status))
> goto out_unlock;
>
> - *pagep = page;
> + *foliop = folio;
> return 0;
>
> out_unlock:
> - unlock_page(page);
> - put_page(page);
> + folio_unlock(folio);
> + folio_put(folio);
> iomap_write_failed(iter->inode, pos, len);
>
> out_no_page:
> @@ -650,11 +651,10 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> }
>
> static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
> - size_t copied, struct page *page)
> + size_t copied, struct folio *folio)
> {
> - struct folio *folio = page_folio(page);
> struct iomap_page *iop = to_iomap_page(folio);
> - flush_dcache_page(page);
> + flush_dcache_folio(folio);
>
> /*
> * The blocks that were entirely written will now be uptodate, so we
> @@ -667,10 +667,10 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
> * non-uptodate page as a zero-length write, and force the caller to
> * redo the whole thing.
> */
> - if (unlikely(copied < len && !PageUptodate(page)))
> + if (unlikely(copied < len && !folio_test_uptodate(folio)))
> return 0;
> iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
> - __set_page_dirty_nobuffers(page);
> + filemap_dirty_folio(inode->i_mapping, folio);
> return copied;
> }
>
> @@ -694,8 +694,9 @@ static size_t iomap_write_end_inline(const struct iomap_iter *iter,
>
> /* Returns the number of bytes copied. May be 0. Cannot be an errno. */
> static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
> - size_t copied, struct page *page)
> + size_t copied, struct folio *folio)
> {
> + struct page *page = folio_file_page(folio, pos >> PAGE_SHIFT);
> const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
> const struct iomap *srcmap = iomap_iter_srcmap(iter);
> loff_t old_size = iter->inode->i_size;
> @@ -707,7 +708,7 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
> ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
> copied, page, NULL);
> } else {
> - ret = __iomap_write_end(iter->inode, pos, len, copied, page);
> + ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
> }
>
> /*
> @@ -719,13 +720,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
> i_size_write(iter->inode, pos + ret);
> iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
> }
> - unlock_page(page);
> + folio_unlock(folio);
>
> if (old_size < pos)
> pagecache_isize_extended(iter->inode, old_size, pos);
> if (page_ops && page_ops->page_done)
> page_ops->page_done(iter->inode, pos, ret, page);
> - put_page(page);
> + folio_put(folio);
>
> if (ret < len)
> iomap_write_failed(iter->inode, pos, len);
> @@ -740,6 +741,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> long status = 0;
>
> do {
> + struct folio *folio;
> struct page *page;
> unsigned long offset; /* Offset into pagecache page */
> unsigned long bytes; /* Bytes to write to page */
> @@ -763,16 +765,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
> break;
> }
>
> - status = iomap_write_begin(iter, pos, bytes, &page);
> + status = iomap_write_begin(iter, pos, bytes, &folio);
> if (unlikely(status))
> break;
>
> + page = folio_file_page(folio, pos >> PAGE_SHIFT);
> if (mapping_writably_mapped(iter->inode->i_mapping))
> flush_dcache_page(page);
>
> copied = copy_page_from_iter_atomic(page, offset, bytes, i);
>
> - status = iomap_write_end(iter, pos, bytes, copied, page);
> + status = iomap_write_end(iter, pos, bytes, copied, folio);
>
> if (unlikely(copied != status))
> iov_iter_revert(i, copied - status);
> @@ -838,13 +841,13 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
> do {
> unsigned long offset = offset_in_page(pos);
> unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length);
> - struct page *page;
> + struct folio *folio;
>
> - status = iomap_write_begin(iter, pos, bytes, &page);
> + status = iomap_write_begin(iter, pos, bytes, &folio);
> if (unlikely(status))
> return status;
>
> - status = iomap_write_end(iter, pos, bytes, bytes, page);
> + status = iomap_write_end(iter, pos, bytes, bytes, folio);
> if (WARN_ON_ONCE(status == 0))
> return -EIO;
>
> @@ -880,19 +883,19 @@ EXPORT_SYMBOL_GPL(iomap_file_unshare);
>
> static s64 __iomap_zero_iter(struct iomap_iter *iter, loff_t pos, u64 length)
> {
> - struct page *page;
> + struct folio *folio;
> int status;
> unsigned offset = offset_in_page(pos);
> unsigned bytes = min_t(u64, PAGE_SIZE - offset, length);
>
> - status = iomap_write_begin(iter, pos, bytes, &page);
> + status = iomap_write_begin(iter, pos, bytes, &folio);
> if (status)
> return status;
>
> - zero_user(page, offset, bytes);
> - mark_page_accessed(page);
> + zero_user(folio_file_page(folio, pos >> PAGE_SHIFT), offset, bytes);
> + folio_mark_accessed(folio);
>
> - return iomap_write_end(iter, pos, bytes, bytes, page);
> + return iomap_write_end(iter, pos, bytes, bytes, folio);
> }
>
> static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
> --
> 2.33.0
>
diff --git a/block/bio.c b/block/bio.c
index 15ab0d6d1c06..0e911c4fb9f2 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1033,6 +1033,28 @@ int bio_add_page(struct bio *bio, struct page *page, } EXPORT_SYMBOL(bio_add_page);
+/**
+ * bio_add_folio - Attempt to add part of a folio to a bio.
+ * @bio: BIO to add to.
+ * @folio: Folio to add.
+ * @len: How many bytes from the folio to add.
+ * @off: First byte in this folio to add.
+ *
+ * Filesystems that use folios can call this function instead of
+calling
+ * bio_add_page() for each page in the folio. If @off is bigger than
+ * PAGE_SIZE, this function can create a bio_vec that starts in a page
+ * after the bv_page. BIOs do not support folios that are 4GiB or larger.
+ *
+ * Return: Whether the addition was successful.
+ */
+bool bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
+ size_t off)
+{
+ if (len > UINT_MAX || off > UINT_MAX)
+ return 0;
+ return bio_add_page(bio, &folio->page, len, off) > 0; }
+
Newline.
On Wed, Nov 03, 2021 at 01:25:57AM +0000, wangjianjian (C) wrote:
> diff --git a/block/bio.c b/block/bio.c
> index 15ab0d6d1c06..0e911c4fb9f2 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1033,6 +1033,28 @@ int bio_add_page(struct bio *bio, struct page *page, } EXPORT_SYMBOL(bio_add_page);
>
> +/**
> + * bio_add_folio - Attempt to add part of a folio to a bio.
> + * @bio: BIO to add to.
> + * @folio: Folio to add.
> + * @len: How many bytes from the folio to add.
> + * @off: First byte in this folio to add.
> + *
> + * Filesystems that use folios can call this function instead of
> +calling
> + * bio_add_page() for each page in the folio. If @off is bigger than
> + * PAGE_SIZE, this function can create a bio_vec that starts in a page
> + * after the bv_page. BIOs do not support folios that are 4GiB or larger.
> + *
> + * Return: Whether the addition was successful.
> + */
> +bool bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
> + size_t off)
> +{
> + if (len > UINT_MAX || off > UINT_MAX)
> + return 0;
> + return bio_add_page(bio, &folio->page, len, off) > 0; }
> +
>
>
> Newline.
I think it's your mail system that's mangled it. Here's how it looked
to the rest of the world:
https://lore.kernel.org/linux-xfs/[email protected]/
On Tue, Nov 02, 2021 at 04:22:15PM -0700, Darrick J. Wong wrote:
> > + page = folio_file_page(folio, pos >> PAGE_SHIFT);
>
> Isn't this only needed in the BUFFER_HEAD case?
Good catch. Want me to fold this in?
+++ b/fs/iomap/buffered-io.c
@@ -608,7 +608,6 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
const struct iomap_page_ops *page_ops = iter->iomap.page_ops;
const struct iomap *srcmap = iomap_iter_srcmap(iter);
struct folio *folio;
- struct page *page;
unsigned fgp = FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE | FGP_NOFS;
int status = 0;
@@ -632,12 +631,12 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
goto out_no_page;
}
- page = folio_file_page(folio, pos >> PAGE_SHIFT);
if (srcmap->type == IOMAP_INLINE)
status = iomap_write_begin_inline(iter, folio);
- else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
+ else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
+ struct page *page = folio_file_page(folio, pos >> PAGE_SHIFT);
status = __block_write_begin_int(page, pos, len, NULL, srcmap);
- else
+ } else
status = __iomap_write_begin(iter, pos, len, folio);
if (unlikely(status))
On Wed, Nov 03, 2021 at 03:15:13AM +0000, Matthew Wilcox wrote:
> On Tue, Nov 02, 2021 at 04:22:15PM -0700, Darrick J. Wong wrote:
> > > + page = folio_file_page(folio, pos >> PAGE_SHIFT);
> >
> > Isn't this only needed in the BUFFER_HEAD case?
>
> Good catch. Want me to fold this in?
>
> @@ -632,12 +631,12 @@ static int iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
> goto out_no_page;
> }
>
> - page = folio_file_page(folio, pos >> PAGE_SHIFT);
> if (srcmap->type == IOMAP_INLINE)
> status = iomap_write_begin_inline(iter, folio);
> - else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
> + else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
> + struct page *page = folio_file_page(folio, pos >> PAGE_SHIFT);
> status = __block_write_begin_int(page, pos, len, NULL, srcmap);
On second thoughts, this is silly. __block_write_begin_int() doesn't
want the precise page (because it constructs buffer_heads and attaches
them to the passed-in page). I should just pass &folio->page here.
And __block_write_begin_int() should be converted to take a folio
at some point.
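Concretely, that would collapse the BUFFER_HEAD branch back to a single
statement, something like this (a sketch of the idea, not a posted patch):

	else if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
		status = __block_write_begin_int(&folio->page, pos, len,
				NULL, srcmap);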
On Mon, Nov 01, 2021 at 08:39:25PM +0000, Matthew Wilcox (Oracle) wrote:
> XFS has the only implementation of ->discard_page today, so convert it
> to use folios in the same patch as converting the API.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
LGTM
Reviewed-by: Darrick J. Wong <[email protected]>
--D
> ---
> fs/iomap/buffered-io.c | 4 ++--
> fs/xfs/xfs_aops.c | 24 ++++++++++++------------
> include/linux/iomap.h | 2 +-
> 3 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 6862487f4067..c50ae76835ca 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1349,8 +1349,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> * won't be affected by I/O completion and we must unlock it
> * now.
> */
> - if (wpc->ops->discard_page)
> - wpc->ops->discard_page(page, file_offset);
> + if (wpc->ops->discard_folio)
> + wpc->ops->discard_folio(page_folio(page), file_offset);
> if (!count) {
> ClearPageUptodate(page);
> unlock_page(page);
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 34fc6148032a..c6c4d07d0d26 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -428,37 +428,37 @@ xfs_prepare_ioend(
> * see a ENOSPC in writeback).
> */
> static void
> -xfs_discard_page(
> - struct page *page,
> - loff_t fileoff)
> +xfs_discard_folio(
> + struct folio *folio,
> + loff_t pos)
> {
> - struct inode *inode = page->mapping->host;
> + struct inode *inode = folio->mapping->host;
> struct xfs_inode *ip = XFS_I(inode);
> struct xfs_mount *mp = ip->i_mount;
> - unsigned int pageoff = offset_in_page(fileoff);
> - xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, fileoff);
> - xfs_fileoff_t pageoff_fsb = XFS_B_TO_FSBT(mp, pageoff);
> + size_t offset = offset_in_folio(folio, pos);
> + xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, pos);
> + xfs_fileoff_t pageoff_fsb = XFS_B_TO_FSBT(mp, offset);
> int error;
>
> if (xfs_is_shutdown(mp))
> goto out_invalidate;
>
> xfs_alert_ratelimited(mp,
> - "page discard on page "PTR_FMT", inode 0x%llx, offset %llu.",
> - page, ip->i_ino, fileoff);
> + "page discard on page "PTR_FMT", inode 0x%llx, pos %llu.",
> + folio, ip->i_ino, pos);
>
> error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
> - i_blocks_per_page(inode, page) - pageoff_fsb);
> + i_blocks_per_folio(inode, folio) - pageoff_fsb);
> if (error && !xfs_is_shutdown(mp))
> xfs_alert(mp, "page discard unable to remove delalloc mapping.");
> out_invalidate:
> - iomap_invalidatepage(page, pageoff, PAGE_SIZE - pageoff);
> + iomap_invalidate_folio(folio, offset, folio_size(folio) - offset);
> }
>
> static const struct iomap_writeback_ops xfs_writeback_ops = {
> .map_blocks = xfs_map_blocks,
> .prepare_ioend = xfs_prepare_ioend,
> - .discard_page = xfs_discard_page,
> + .discard_folio = xfs_discard_folio,
> };
>
> STATIC int
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 91de58ca09fc..1a161314d7e4 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -285,7 +285,7 @@ struct iomap_writeback_ops {
> * Optional, allows the file system to discard state on a page where
> * we failed to submit any I/O.
> */
> - void (*discard_page)(struct page *page, loff_t fileoff);
> + void (*discard_folio)(struct folio *folio, loff_t pos);
> };
>
> struct iomap_writepage_ctx {
> --
> 2.33.0
>
On Tue, Nov 02, 2021 at 08:28:02PM +0000, Matthew Wilcox wrote:
> On Tue, Nov 02, 2021 at 12:26:42AM -0700, Christoph Hellwig wrote:
> > Looking at the code not part of the context this looks fine. But I
> > really wonder if this (and also the blocks change above) would be
> > better off being split into separate, clearly documented patches.
>
> How do these three patches look? I retained your R-b on all three since
> I figured the one you offered below was good for all of them.
Sounds good, and the patches look good. Minor nitpicks below:
> Rename end_offset to end_pos and file_offset to pos to match the
> rest of the file. Simplify the loop by calculating nblocks
> up front instead of each time around the loop.
Might be worth mentioning why it changes the types from u64 to loff_t.
> /*
> - * Walk through the page to find areas to write back. If we run off the
> - * end of the current map or find the current map invalid, grab a new
> - * one.
> + * Walk through the folio to find areas to write back. If we
> + * run off the end of the current map or find the current map
> + * invalid, grab a new one.
No real need for reflowing the comment, it still fits just fine even
with the folio change.
> Rename end_offset to end_pos and offset_into_page to poff to match the
> rest of the file. Simplify the handling of the last page straddling
> i_size.
... by doing the EOF check purely based on the byte granularity i_size
instead of converting to a pgoff prematurely.
> + isize = i_size_read(inode);
> + end_pos = page_offset(page) + PAGE_SIZE;
> + if (end_pos - 1 >= isize) {
Wouldn't this check be more obvious as:
if (end_pos > i_size) {
On Tue, Nov 02, 2021 at 08:28:02PM +0000, Matthew Wilcox wrote:
> On Tue, Nov 02, 2021 at 12:26:42AM -0700, Christoph Hellwig wrote:
> > Looking at the code not part of the context this looks fine. But I
> > really wonder if this (and also the blocks change above) would be
> > better off being split into separate, clearly documented patches.
>
> How do these three patches look? I retained your R-b on all three since
> I figured the one you offered below was good for all of them.
(TLDR: I have two RVB and a question about the third patch, please
scroll down...)
> From ab7cace8f325ca5cc1b1e62e6a8498c84738bc10 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <[email protected]>
> Date: Tue, 2 Nov 2021 10:51:55 -0400
> Subject: [PATCH 1/3] iomap: Simplify iomap_writepage_map()
>
> Rename end_offset to end_pos and file_offset to pos to match the
> rest of the file. Simplify the loop by calculating nblocks
> up front instead of each time around the loop.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>
> ---
> fs/iomap/buffered-io.c | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 8f47879f9f05..e32e3cb2cf86 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1296,37 +1296,36 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
> static int
> iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> struct writeback_control *wbc, struct inode *inode,
> - struct page *page, u64 end_offset)
> + struct page *page, loff_t end_pos)
> {
> struct folio *folio = page_folio(page);
> struct iomap_page *iop = iomap_page_create(inode, folio);
> struct iomap_ioend *ioend, *next;
> unsigned len = i_blocksize(inode);
> - u64 file_offset; /* file offset of page */
> + unsigned nblocks = i_blocks_per_folio(inode, folio);
> + loff_t pos = folio_pos(folio);
> int error = 0, count = 0, i;
> LIST_HEAD(submit_list);
>
> WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
>
> /*
> - * Walk through the page to find areas to write back. If we run off the
> - * end of the current map or find the current map invalid, grab a new
> - * one.
> + * Walk through the folio to find areas to write back. If we
> + * run off the end of the current map or find the current map
> + * invalid, grab a new one.
> */
> - for (i = 0, file_offset = page_offset(page);
> - i < (PAGE_SIZE >> inode->i_blkbits) && file_offset < end_offset;
> - i++, file_offset += len) {
> + for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
> if (iop && !test_bit(i, iop->uptodate))
> continue;
>
> - error = wpc->ops->map_blocks(wpc, inode, file_offset);
> + error = wpc->ops->map_blocks(wpc, inode, pos);
> if (error)
> break;
> if (WARN_ON_ONCE(wpc->iomap.type == IOMAP_INLINE))
> continue;
> if (wpc->iomap.type == IOMAP_HOLE)
> continue;
> - iomap_add_to_ioend(inode, file_offset, page, iop, wpc, wbc,
> + iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
> &submit_list);
> count++;
> }
> @@ -1350,7 +1349,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> * now.
> */
> if (wpc->ops->discard_folio)
> - wpc->ops->discard_folio(page_folio(page), file_offset);
> + wpc->ops->discard_folio(folio, pos);
/me wonders why this wouldn't have been done in whichever patch added
folio as a local variable, but fmeh, the end result is the same:
Pretty straightforward conversion,
Reviewed-by: Darrick J. Wong <[email protected]>
(onto the next patch)
> if (!count) {
> ClearPageUptodate(page);
> unlock_page(page);
> --
> 2.33.0
>
>
> From 07c994353e357c3b4252595a80b86e8565deb09c Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <[email protected]>
> Date: Tue, 2 Nov 2021 11:41:16 -0400
> Subject: [PATCH 2/3] iomap: Simplify iomap_do_writepage()
>
> Rename end_offset to end_pos and offset_into_page to poff to match the
> rest of the file. Simplify the handling of the last page straddling
> i_size.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>
> ---
> fs/iomap/buffered-io.c | 23 ++++++++++-------------
> 1 file changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index e32e3cb2cf86..4f4f33849417 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1397,9 +1397,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> {
> struct iomap_writepage_ctx *wpc = data;
> struct inode *inode = page->mapping->host;
> - pgoff_t end_index;
> - u64 end_offset;
> - loff_t offset;
> + loff_t end_pos, isize;
>
> trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
>
> @@ -1430,11 +1428,9 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * | desired writeback range | see else |
> * ---------------------------------^------------------|
> */
> - offset = i_size_read(inode);
> - end_index = offset >> PAGE_SHIFT;
> - if (page->index < end_index)
> - end_offset = (loff_t)(page->index + 1) << PAGE_SHIFT;
> - else {
> + isize = i_size_read(inode);
> + end_pos = page_offset(page) + PAGE_SIZE;
> + if (end_pos - 1 >= isize) {
This old code was good at twisting my brain in knots, thanks for
cleaning this up.
Reviewed-by: Darrick J. Wong <[email protected]>
(onto the third patch)
> /*
> * Check whether the page to write out is beyond or straddles
> * i_size or not.
> @@ -1446,7 +1442,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * | | Straddles |
> * ---------------------------------^-----------|--------|
> */
> - unsigned offset_into_page = offset & (PAGE_SIZE - 1);
> + size_t poff = offset_in_page(isize);
> + pgoff_t end_index = isize >> PAGE_SHIFT;
>
> /*
> * Skip the page if it's fully outside i_size, e.g. due to a
> @@ -1466,7 +1463,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * offset is just equal to the EOF.
> */
> if (page->index > end_index ||
> - (page->index == end_index && offset_into_page == 0))
> + (page->index == end_index && poff == 0))
> goto redirty;
>
> /*
> @@ -1477,13 +1474,13 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * memory is zeroed when mapped, and writes to that region are
> * not written out to the file."
> */
> - zero_user_segment(page, offset_into_page, PAGE_SIZE);
> + zero_user_segment(page, poff, PAGE_SIZE);
>
> /* Adjust the end_offset to the end of file */
> - end_offset = offset;
> + end_pos = isize;
> }
>
> - return iomap_writepage_map(wpc, wbc, inode, page, end_offset);
> + return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
>
> redirty:
> redirty_page_for_writepage(wbc, page);
> --
> 2.33.0
>
>
> From d5412657a503ae27efb5770fbc1c5c980180c9c4 Mon Sep 17 00:00:00 2001
> From: "Matthew Wilcox (Oracle)" <[email protected]>
> Date: Tue, 2 Nov 2021 12:45:12 -0400
> Subject: [PATCH 3/3] iomap: Convert iomap_add_to_ioend to take a folio
>
> We still iterate one block at a time, but now we call compound_head()
> less often.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>
> ---
> fs/iomap/buffered-io.c | 70 ++++++++++++++++++++----------------------
> 1 file changed, 34 insertions(+), 36 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 4f4f33849417..8908368abd49 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1252,29 +1252,29 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
> * first; otherwise finish off the current ioend and start another.
> */
> static void
> -iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
> +iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
> struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
> struct writeback_control *wbc, struct list_head *iolist)
> {
> - sector_t sector = iomap_sector(&wpc->iomap, offset);
> + sector_t sector = iomap_sector(&wpc->iomap, pos);
> unsigned len = i_blocksize(inode);
> - unsigned poff = offset & (PAGE_SIZE - 1);
> + size_t poff = offset_in_folio(folio, pos);
>
> - if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, offset, sector)) {
> + if (!wpc->ioend || !iomap_can_add_to_ioend(wpc, pos, sector)) {
> if (wpc->ioend)
> list_add(&wpc->ioend->io_list, iolist);
> - wpc->ioend = iomap_alloc_ioend(inode, wpc, offset, sector, wbc);
> + wpc->ioend = iomap_alloc_ioend(inode, wpc, pos, sector, wbc);
> }
>
> - if (bio_add_page(wpc->ioend->io_bio, page, len, poff) != len) {
> + if (!bio_add_folio(wpc->ioend->io_bio, folio, len, poff)) {
> wpc->ioend->io_bio = iomap_chain_bio(wpc->ioend->io_bio);
> - __bio_add_page(wpc->ioend->io_bio, page, len, poff);
> + bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
> }
>
> if (iop)
> atomic_add(len, &iop->write_bytes_pending);
> wpc->ioend->io_size += len;
> - wbc_account_cgroup_owner(wbc, page, len);
> + wbc_account_cgroup_owner(wbc, &folio->page, len);
> }
>
> /*
> @@ -1296,9 +1296,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t offset, struct page *page,
> static int
> iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> struct writeback_control *wbc, struct inode *inode,
> - struct page *page, loff_t end_pos)
> + struct folio *folio, loff_t end_pos)
> {
> - struct folio *folio = page_folio(page);
> struct iomap_page *iop = iomap_page_create(inode, folio);
> struct iomap_ioend *ioend, *next;
> unsigned len = i_blocksize(inode);
> @@ -1325,15 +1324,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> continue;
> if (wpc->iomap.type == IOMAP_HOLE)
> continue;
> - iomap_add_to_ioend(inode, pos, page, iop, wpc, wbc,
> + iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
> &submit_list);
> count++;
> }
>
> WARN_ON_ONCE(!wpc->ioend && !list_empty(&submit_list));
> - WARN_ON_ONCE(!PageLocked(page));
> - WARN_ON_ONCE(PageWriteback(page));
> - WARN_ON_ONCE(PageDirty(page));
> + WARN_ON_ONCE(!folio_test_locked(folio));
> + WARN_ON_ONCE(folio_test_writeback(folio));
> + WARN_ON_ONCE(folio_test_dirty(folio));
>
> /*
> * We cannot cancel the ioend directly here on error. We may have
> @@ -1351,14 +1350,14 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> if (wpc->ops->discard_folio)
> wpc->ops->discard_folio(folio, pos);
> if (!count) {
> - ClearPageUptodate(page);
> - unlock_page(page);
> + folio_clear_uptodate(folio);
> + folio_unlock(folio);
> goto done;
> }
> }
>
> - set_page_writeback(page);
> - unlock_page(page);
> + folio_start_writeback(folio);
> + folio_unlock(folio);
>
> /*
> * Preserve the original error if there was one; catch
> @@ -1379,9 +1378,9 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> * with a partial page truncate on a sub-page block sized filesystem.
> */
> if (!count)
> - end_page_writeback(page);
> + folio_end_writeback(folio);
> done:
> - mapping_set_error(page->mapping, error);
> + mapping_set_error(folio->mapping, error);
> return error;
> }
>
> @@ -1395,14 +1394,15 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
> static int
> iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> {
> + struct folio *folio = page_folio(page);
> struct iomap_writepage_ctx *wpc = data;
> - struct inode *inode = page->mapping->host;
> + struct inode *inode = folio->mapping->host;
> loff_t end_pos, isize;
>
> - trace_iomap_writepage(inode, page_offset(page), PAGE_SIZE);
> + trace_iomap_writepage(inode, folio_pos(folio), folio_size(folio));
>
> /*
> - * Refuse to write the page out if we're called from reclaim context.
> + * Refuse to write the folio out if we're called from reclaim context.
> *
> * This avoids stack overflows when called from deeply used stacks in
> * random callers for direct reclaim or memcg reclaim. We explicitly
> @@ -1416,10 +1416,10 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> goto redirty;
>
> /*
> - * Is this page beyond the end of the file?
> + * Is this folio beyond the end of the file?
> *
> - * The page index is less than the end_index, adjust the end_offset
> - * to the highest offset that this page should represent.
> + * The folio index is less than the end_index, adjust the end_pos
> + * to the highest offset that this folio should represent.
> * -----------------------------------------------------
> * | file mapping | <EOF> |
> * -----------------------------------------------------
> @@ -1429,7 +1429,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * ---------------------------------^------------------|
> */
> isize = i_size_read(inode);
> - end_pos = page_offset(page) + PAGE_SIZE;
> + end_pos = folio_pos(folio) + folio_size(folio);
> if (end_pos - 1 >= isize) {
> /*
> * Check whether the page to write out is beyond or straddles
> @@ -1442,7 +1442,7 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * | | Straddles |
> * ---------------------------------^-----------|--------|
> */
> - size_t poff = offset_in_page(isize);
> + size_t poff = offset_in_folio(folio, isize);
> pgoff_t end_index = isize >> PAGE_SHIFT;
>
> /*
> @@ -1462,8 +1462,8 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * checking if the page is totally beyond i_size or if its
> * offset is just equal to the EOF.
> */
> - if (page->index > end_index ||
> - (page->index == end_index && poff == 0))
> + if (folio->index > end_index ||
> + (folio->index == end_index && poff == 0))
> goto redirty;
>
> /*
> @@ -1474,17 +1474,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> * memory is zeroed when mapped, and writes to that region are
> * not written out to the file."
> */
> - zero_user_segment(page, poff, PAGE_SIZE);
> -
> - /* Adjust the end_offset to the end of file */
> + zero_user_segment(&folio->page, poff, folio_size(folio));
Question: is &folio->page != page here? I guess the idea is that we
have a (potentially multi-page) folio straddling i_size, and we need to
zero everything in the whole folio after i_size. But then why not pass
the whole folio?
--D
> end_pos = isize;
> }
>
> - return iomap_writepage_map(wpc, wbc, inode, page, end_pos);
> + return iomap_writepage_map(wpc, wbc, inode, folio, end_pos);
>
> redirty:
> - redirty_page_for_writepage(wbc, page);
> - unlock_page(page);
> + folio_redirty_for_writepage(wbc, folio);
> + folio_unlock(folio);
> return 0;
> }
>
> --
> 2.33.0
>
On Mon, Nov 01, 2021 at 08:39:27PM +0000, Matthew Wilcox (Oracle) wrote:
> The arguments are still pages for now, but we can use folios internally
> and cut out a lot of calls to compound_head().
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
--D
> ---
> fs/iomap/buffered-io.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 2436933dfe42..3b93fdfedb72 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -493,19 +493,21 @@ int
> iomap_migrate_page(struct address_space *mapping, struct page *newpage,
> struct page *page, enum migrate_mode mode)
> {
> + struct folio *folio = page_folio(page);
> + struct folio *newfolio = page_folio(newpage);
> int ret;
>
> - ret = migrate_page_move_mapping(mapping, newpage, page, 0);
> + ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
> if (ret != MIGRATEPAGE_SUCCESS)
> return ret;
>
> - if (page_has_private(page))
> - attach_page_private(newpage, detach_page_private(page));
> + if (folio_test_private(folio))
> + folio_attach_private(newfolio, folio_detach_private(folio));
>
> if (mode != MIGRATE_SYNC_NO_COPY)
> - migrate_page_copy(newpage, page);
> + folio_migrate_copy(newfolio, folio);
> else
> - migrate_page_states(newpage, page);
> + folio_migrate_flags(newfolio, folio);
> return MIGRATEPAGE_SUCCESS;
> }
> EXPORT_SYMBOL_GPL(iomap_migrate_page);
> --
> 2.33.0
>
On Mon, Nov 01, 2021 at 08:39:28PM +0000, Matthew Wilcox (Oracle) wrote:
> If we're punching a hole in a multi-page folio, we need to remove the
> per-folio iomap data as the folio is about to be split and each page will
> need its own. If a dirty folio is only partially-uptodate, the iomap
> data contains the information about which blocks cannot be written back,
> so assert that a dirty folio is fully uptodate.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Looks good to me,
Reviewed-by: Darrick J. Wong <[email protected]>
--D
> ---
> fs/iomap/buffered-io.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 3b93fdfedb72..9d7c91f9ec1d 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -470,13 +470,18 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
> trace_iomap_invalidatepage(folio->mapping->host, offset, len);
>
> /*
> - * If we're invalidating the entire page, clear the dirty state from it
> - * and release it to avoid unnecessary buildup of the LRU.
> + * If we're invalidating the entire folio, clear the dirty state
> + * from it and release it to avoid unnecessary buildup of the LRU.
> */
> if (offset == 0 && len == folio_size(folio)) {
> WARN_ON_ONCE(folio_test_writeback(folio));
> folio_cancel_dirty(folio);
> iomap_page_release(folio);
> + } else if (folio_test_multi(folio)) {
> + /* Must release the iop so the page can be split */
> + WARN_ON_ONCE(!folio_test_uptodate(folio) &&
> + folio_test_dirty(folio));
> + iomap_page_release(folio);
> }
> }
> EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
> --
> 2.33.0
>
On Mon, Nov 01, 2021 at 08:39:29PM +0000, Matthew Wilcox (Oracle) wrote:
> Now that iomap has been converted, XFS is multi-page folio safe.
> Indicate to the VFS that it can now create multi-page folios for XFS.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Provisional
Reviewed-by: Darrick J. Wong <[email protected]>
...assuming you've run generic/521 and generic/522 (fsx) and generic/476
(fsstress) through the grinder for several days?
And just for laughs, could you run those three (for an hour or two) with
MKFS_OPTIONS='-m reflink=0,rmapbt=0 -d rtinherit=1 -r extsize=28k,rtdev=/dev/XXX'
just to see how well multipage folios deal with 4k blocks allocated in
chunks of 28k on the realtime device? Pretty please? :D
--D
> ---
> fs/xfs/xfs_icache.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index f2210d927481..804507c82455 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -87,6 +87,7 @@ xfs_inode_alloc(
> /* VFS doesn't initialise i_mode or i_state! */
> VFS_I(ip)->i_mode = 0;
> VFS_I(ip)->i_state = 0;
> + mapping_set_large_folios(VFS_I(ip)->i_mapping);
>
> XFS_STATS_INC(mp, vn_active);
> ASSERT(atomic_read(&ip->i_pincount) == 0);
> @@ -336,6 +337,7 @@ xfs_reinit_inode(
> inode->i_rdev = dev;
> inode->i_uid = uid;
> inode->i_gid = gid;
> + mapping_set_large_folios(inode->i_mapping);
> return error;
> }
>
> --
> 2.33.0
>
On Wed, Nov 03, 2021 at 08:54:50AM -0700, Christoph Hellwig wrote:
> > - * Walk through the page to find areas to write back. If we run off the
> > - * end of the current map or find the current map invalid, grab a new
> > - * one.
> > + * Walk through the folio to find areas to write back. If we
> > + * run off the end of the current map or find the current map
> > + * invalid, grab a new one.
>
> No real need for reflowing the comment, it still fits just fine even
> with the folio change.
Sure, but I don't like using column 79, unless it's better to. We're on
three lines anyway; may as well make better use of that third line.
> > + isize = i_size_read(inode);
> > + end_pos = page_offset(page) + PAGE_SIZE;
> > + if (end_pos - 1 >= isize) {
>
> Wouldn't this check be more obvious as:
>
> if (end_pos > i_size) {
I _think_ we restrict the maximum file size to 2^63 - 1 to avoid i_size
ever being negative. But that means that end_pos might be 2^63 (ie
LONG_MIN), so we need to subtract one from it to get the right answer.
Maybe worth a comment?
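Based on that reasoning, the comment could say something like this
(illustrative wording only):

	/*
	 * i_size is capped at 2^63 - 1 so it can never be negative, but
	 * page_offset() + PAGE_SIZE for the last possible page can be 2^63,
	 * which wraps to a negative loff_t.  Compare end_pos - 1 against
	 * i_size so the check stays in range.
	 */
	if (end_pos - 1 >= isize) {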
On Wed, Nov 03, 2021 at 09:00:57AM -0700, Darrick J. Wong wrote:
> > - wpc->ops->discard_folio(page_folio(page), file_offset);
> > + wpc->ops->discard_folio(folio, pos);
>
> /me wonders why this wouldn't have been done in whichever patch added
> folio as a local variable, but fmeh, the end result is the same:
Found it and fixed it.
> > @@ -1474,17 +1474,15 @@ iomap_do_writepage(struct page *page, struct writeback_control *wbc, void *data)
> > * memory is zeroed when mapped, and writes to that region are
> > * not written out to the file."
> > */
> > - zero_user_segment(page, poff, PAGE_SIZE);
> > -
> > - /* Adjust the end_offset to the end of file */
> > + zero_user_segment(&folio->page, poff, folio_size(folio));
>
> Question: is &folio->page != page here? I guess the idea is that we
> have a (potentially multi-page) folio straddling i_size, and we need to
> zero everything in the whole folio after i_size. But then why not pass
> the whole folio?
Ugh, thanks. You made me realise that zero_user_segments() is still
conditional on CONFIG_TRANSPARENT_HUGEPAGE. It's a relic of when I
was going to do all of this with THP; before I switched to the folio
mental model.
So now we're going to get folio_zero_segments(), folio_zero_segment()
and folio_zero_range().
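Once those land, the call above would presumably become something like
this (a sketch, assuming the folio helper keeps zero_user_segment()'s
argument order):

	folio_zero_segment(folio, poff, folio_size(folio));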
On Thu, Nov 04, 2021 at 03:33:52AM +0000, Matthew Wilcox wrote:
> On Wed, Nov 03, 2021 at 08:54:50AM -0700, Christoph Hellwig wrote:
> > > - * Walk through the page to find areas to write back. If we run off the
> > > - * end of the current map or find the current map invalid, grab a new
> > > - * one.
> > > + * Walk through the folio to find areas to write back. If we
> > > + * run off the end of the current map or find the current map
> > > + * invalid, grab a new one.
> >
> > No real need for reflowing the comment, it still fits just fine even
> > with the folio change.
>
> Sure, but I don't like using column 79, unless it's better to. We're on
> three lines anyway; may as well make better use of that third line.
Ok, that's a little weird, but it's a personal preference. That being
said, reflowing the whole comment just for that seems odd.
>
> > > + isize = i_size_read(inode);
> > > + end_pos = page_offset(page) + PAGE_SIZE;
> > > + if (end_pos - 1 >= isize) {
> >
> > Wouldn't this check be more obvious as:
> >
> > if (end_pos > i_size) {
>
> I _think_ we restrict the maximum file size to 2^63 - 1 to avoid i_size
> ever being negative. But that means that end_pos might be 2^63 (ie
> LONG_MIN), so we need to subtract one from it to get the right answer.
> Maybe worth a comment?
Yes, please.
On Thu, Nov 04, 2021 at 01:38:48AM -0700, Christoph Hellwig wrote:
> > I _think_ we restrict the maximum file size to 2^63 - 1 to avoid i_size
> > ever being negative. But that means that end_pos might be 2^63 (ie
> > LONG_MIN), so we need to subtract one from it to get the right answer.
> > Maybe worth a comment?
>
> Yes, please.
Or we should stick to the u64 type that the existing code uses to
sidestep that whole issue.
On Mon, Nov 01, 2021 at 08:39:27PM +0000, Matthew Wilcox (Oracle) wrote:
> +++ b/fs/iomap/buffered-io.c
> @@ -493,19 +493,21 @@ int
> iomap_migrate_page(struct address_space *mapping, struct page *newpage,
> struct page *page, enum migrate_mode mode)
> {
> + struct folio *folio = page_folio(page);
> + struct folio *newfolio = page_folio(newpage);
Re-reviewing this patch, and I don't like the naming. How about:
struct folio *src = page_folio(page);
struct folio *dest = page_folio(newpage);
... eventually flowing that renaming throughout the migration
implementations.
> int ret;
>
> - ret = migrate_page_move_mapping(mapping, newpage, page, 0);
> + ret = folio_migrate_mapping(mapping, newfolio, folio, 0);
> if (ret != MIGRATEPAGE_SUCCESS)
> return ret;
>
> - if (page_has_private(page))
> - attach_page_private(newpage, detach_page_private(page));
> + if (folio_test_private(folio))
> + folio_attach_private(newfolio, folio_detach_private(folio));
>
> if (mode != MIGRATE_SYNC_NO_COPY)
> - migrate_page_copy(newpage, page);
> + folio_migrate_copy(newfolio, folio);
> else
> - migrate_page_states(newpage, page);
> + folio_migrate_flags(newfolio, folio);
> return MIGRATEPAGE_SUCCESS;
> }
> EXPORT_SYMBOL_GPL(iomap_migrate_page);
> --
> 2.33.0
>
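Applied to the function quoted above, that renaming would look roughly
like this (a sketch of the suggestion, not a posted patch):

	int
	iomap_migrate_page(struct address_space *mapping, struct page *newpage,
			struct page *page, enum migrate_mode mode)
	{
		struct folio *dest = page_folio(newpage);
		struct folio *src = page_folio(page);
		int ret;

		ret = folio_migrate_mapping(mapping, dest, src, 0);
		if (ret != MIGRATEPAGE_SUCCESS)
			return ret;

		if (folio_test_private(src))
			folio_attach_private(dest, folio_detach_private(src));

		if (mode != MIGRATE_SYNC_NO_COPY)
			folio_migrate_copy(dest, src);
		else
			folio_migrate_flags(dest, src);
		return MIGRATEPAGE_SUCCESS;
	}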