2023-03-22 13:51:37

by Pankaj Raghav

[permalink] [raw]
Subject: [RFC v2 0/5] remove page_endio()

It was decided to remove the page_endio() as per the previous RFC
discussion[1] of this series and move that functionality into the caller
itself. One of the side benefit of doing that is the callers have been
modified to directly work on folios as page_endio() already worked on
folios.

mpage changes were tested with a simple boot testing. zram and orangefs is
only build tested. No functional changes were introduced as a part of
this AFAIK.

Open questions:
- Willy pointed out that the calls to folio_set_error() and
folio_clear_uptodate() are not needed anymore in the read path when an
error happens[2]. I still don't understand 100% why they aren't needed
anymore as I see those functions are still called in iomap. It will be
good to put that rationale as a part of the commit message.

[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/ZBSH6Uq6IIXON%[email protected]/

Pankaj Raghav (5):
zram: remove zram_page_end_io function
orangefs: use folios in orangefs_readahead
mpage: split bi_end_io callback for reads and writes
mpage: use folios in bio end_io handler
filemap: remove page_endio()

drivers/block/zram/zram_drv.c | 13 +----------
fs/mpage.c | 44 ++++++++++++++++++++++++++++-------
fs/orangefs/inode.c | 9 +++----
include/linux/pagemap.h | 2 --
mm/filemap.c | 30 ------------------------
5 files changed, 42 insertions(+), 56 deletions(-)

--
2.34.1


2023-03-22 13:51:38

by Pankaj Raghav

[permalink] [raw]
Subject: [RFC v2 2/5] orangefs: use folios in orangefs_readahead

Convert orangefs_readahead() from using struct page to struct folio.
This conversion removes the call to page_endio() which is soon to be
removed, and simplifies the final page handling.

Signed-off-by: Pankaj Raghav <[email protected]>
---
fs/orangefs/inode.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index aefdf1d3be7c..9014bbcc8031 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -244,7 +244,7 @@ static void orangefs_readahead(struct readahead_control *rac)
struct iov_iter iter;
struct inode *inode = rac->mapping->host;
struct xarray *i_pages;
- struct page *page;
+ struct folio *folio;
loff_t new_start = readahead_pos(rac);
int ret;
size_t new_len = 0;
@@ -275,9 +275,10 @@ static void orangefs_readahead(struct readahead_control *rac)
ret = 0;

/* clean up. */
- while ((page = readahead_page(rac))) {
- page_endio(page, false, ret);
- put_page(page);
+ while ((folio = readahead_folio(rac))) {
+ if (!ret)
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
}
}

--
2.34.1

2023-03-22 13:52:25

by Pankaj Raghav

[permalink] [raw]
Subject: [RFC v2 3/5] mpage: split bi_end_io callback for reads and writes

Split the bi_end_io handler for reads and writes similar to other aops.
This is a prep patch before we convert end_io handlers to use folios.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Pankaj Raghav <[email protected]>
---
fs/mpage.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 22b9de5ddd68..3a545bf0f184 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -43,14 +43,28 @@
* status of that page is hard. See end_buffer_async_read() for the details.
* There is no point in duplicating all that complexity.
*/
-static void mpage_end_io(struct bio *bio)
+static void mpage_read_end_io(struct bio *bio)
{
struct bio_vec *bv;
struct bvec_iter_all iter_all;

bio_for_each_segment_all(bv, bio, iter_all) {
struct page *page = bv->bv_page;
- page_endio(page, bio_op(bio),
+ page_endio(page, REQ_OP_READ,
+ blk_status_to_errno(bio->bi_status));
+ }
+
+ bio_put(bio);
+}
+
+static void mpage_write_end_io(struct bio *bio)
+{
+ struct bio_vec *bv;
+ struct bvec_iter_all iter_all;
+
+ bio_for_each_segment_all(bv, bio, iter_all) {
+ struct page *page = bv->bv_page;
+ page_endio(page, REQ_OP_WRITE,
blk_status_to_errno(bio->bi_status));
}

@@ -59,7 +73,11 @@ static void mpage_end_io(struct bio *bio)

static struct bio *mpage_bio_submit(struct bio *bio)
{
- bio->bi_end_io = mpage_end_io;
+ if (op_is_write(bio_op(bio)))
+ bio->bi_end_io = mpage_write_end_io;
+ else
+ bio->bi_end_io = mpage_read_end_io;
+
guard_bio_eod(bio);
submit_bio(bio);
return NULL;
--
2.34.1

2023-03-22 13:53:30

by Pankaj Raghav

[permalink] [raw]
Subject: [RFC v2 4/5] mpage: use folios in bio end_io handler

Use folios in the bio end_io handler. This conversion does the appropriate
handling on the folios in the respective end_io callback and removes the
call to page_endio(), which is soon to be removed.

Signed-off-by: Pankaj Raghav <[email protected]>
---
fs/mpage.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 3a545bf0f184..103505551896 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -45,13 +45,15 @@
*/
static void mpage_read_end_io(struct bio *bio)
{
- struct bio_vec *bv;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
+ int err = blk_status_to_errno(bio->bi_status);

- bio_for_each_segment_all(bv, bio, iter_all) {
- struct page *page = bv->bv_page;
- page_endio(page, REQ_OP_READ,
- blk_status_to_errno(bio->bi_status));
+ bio_for_each_folio_all(fi, bio) {
+ struct folio *folio = fi.folio;
+
+ if (!err)
+ folio_mark_uptodate(folio);
+ folio_unlock(folio);
}

bio_put(bio);
@@ -59,13 +61,21 @@ static void mpage_read_end_io(struct bio *bio)

static void mpage_write_end_io(struct bio *bio)
{
- struct bio_vec *bv;
- struct bvec_iter_all iter_all;
+ struct folio_iter fi;
+ int err = blk_status_to_errno(bio->bi_status);

- bio_for_each_segment_all(bv, bio, iter_all) {
- struct page *page = bv->bv_page;
- page_endio(page, REQ_OP_WRITE,
- blk_status_to_errno(bio->bi_status));
+ bio_for_each_folio_all(fi, bio) {
+ struct folio *folio = fi.folio;
+
+ if (err) {
+ struct address_space *mapping;
+
+ folio_set_error(folio);
+ mapping = folio_mapping(folio);
+ if (mapping)
+ mapping_set_error(mapping, err);
+ }
+ folio_end_writeback(folio);
}

bio_put(bio);
--
2.34.1

2023-03-22 13:58:01

by Pankaj Raghav

[permalink] [raw]
Subject: [RFC v2 5/5] filemap: remove page_endio()

page_endio() is not used anymore. Remove it.

Signed-off-by: Pankaj Raghav <[email protected]>
---
include/linux/pagemap.h | 2 --
mm/filemap.c | 30 ------------------------------
2 files changed, 32 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index fdcd595d2294..73ee6ead90dd 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1076,8 +1076,6 @@ int filemap_migrate_folio(struct address_space *mapping, struct folio *dst,
#else
#define filemap_migrate_folio NULL
#endif
-void page_endio(struct page *page, bool is_write, int err);
-
void folio_end_private_2(struct folio *folio);
void folio_wait_private_2(struct folio *folio);
int folio_wait_private_2_killable(struct folio *folio);
diff --git a/mm/filemap.c b/mm/filemap.c
index 6f3a7e53fccf..a770a207825d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1625,36 +1625,6 @@ void folio_end_writeback(struct folio *folio)
}
EXPORT_SYMBOL(folio_end_writeback);

-/*
- * After completing I/O on a page, call this routine to update the page
- * flags appropriately
- */
-void page_endio(struct page *page, bool is_write, int err)
-{
- struct folio *folio = page_folio(page);
-
- if (!is_write) {
- if (!err) {
- folio_mark_uptodate(folio);
- } else {
- folio_clear_uptodate(folio);
- folio_set_error(folio);
- }
- folio_unlock(folio);
- } else {
- if (err) {
- struct address_space *mapping;
-
- folio_set_error(folio);
- mapping = folio_mapping(folio);
- if (mapping)
- mapping_set_error(mapping, err);
- }
- folio_end_writeback(folio);
- }
-}
-EXPORT_SYMBOL_GPL(page_endio);
-
/**
* __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
* @folio: The folio to lock
--
2.34.1

2023-03-22 14:35:39

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC v2 4/5] mpage: use folios in bio end_io handler

On Wed, Mar 22, 2023 at 02:50:12PM +0100, Pankaj Raghav wrote:
> static void mpage_write_end_io(struct bio *bio)
> {
> - struct bio_vec *bv;
> - struct bvec_iter_all iter_all;
> + struct folio_iter fi;
> + int err = blk_status_to_errno(bio->bi_status);
>
> - bio_for_each_segment_all(bv, bio, iter_all) {
> - struct page *page = bv->bv_page;
> - page_endio(page, REQ_OP_WRITE,
> - blk_status_to_errno(bio->bi_status));
> + bio_for_each_folio_all(fi, bio) {
> + struct folio *folio = fi.folio;
> +
> + if (err) {
> + struct address_space *mapping;
> +
> + folio_set_error(folio);
> + mapping = folio_mapping(folio);
> + if (mapping)
> + mapping_set_error(mapping, err);

The folio is known to belong to this mapping and can't be truncated
while under writeback. So it's safe to do:

folio_set_error(folio);
mapping_set_error(folio->mapping, err);

I'm not even sure I'd bother to pull folio out of the fi.

bio_for_each_folio_all(fi, bio) {
if (err) {
folio_set_error(fi.folio);
mapping_set_error(fi.folio->mapping, err);
}
folio_end_writeback(fi.folio);
}

2023-03-22 19:18:37

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

On Wed, Mar 22, 2023 at 02:50:08PM +0100, Pankaj Raghav wrote:
> It was decided to remove the page_endio() as per the previous RFC
> discussion[1] of this series and move that functionality into the caller
> itself. One of the side benefit of doing that is the callers have been
> modified to directly work on folios as page_endio() already worked on
> folios.
>
> mpage changes were tested with a simple boot testing. zram and orangefs is
> only build tested. No functional changes were introduced as a part of
> this AFAIK.
>
> Open questions:
> - Willy pointed out that the calls to folio_set_error() and
> folio_clear_uptodate() are not needed anymore in the read path when an
> error happens[2]. I still don't understand 100% why they aren't needed
> anymore as I see those functions are still called in iomap. It will be
> good to put that rationale as a part of the commit message.

page_endio() was generic. It needed to handle a lot of cases. When it's
being inlined into various completion routines, we know which cases we
need to handle and can omit all the cases which we don't.

We know the uptodate flag is clear. If the uptodate flag is set,
we don't call the filesystem's read path. Since we know it's clear,
we don't need to clear it.

We don't need to set the error flag. Only some filesystems still use
the error flag, and orangefs isn't one of them. I'd like to get rid
of the error flag altogether, and I've sent patches in the past which
get us a lot closer to that desired outcome. Not sure we're there yet.
Regardless, generic code doesn't check the error flag.

2023-03-23 14:36:09

by Mike Marshall

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

I have tested this patch on orangefs on top of 6.3.0-rc3, no
regressions.

It is very easy to build a single host orangefs test system on
a vm. There are instructions in orangefs.rst, and also I'd
be glad to help make them better...

-Mike

On Wed, Mar 22, 2023 at 9:50 AM Pankaj Raghav <[email protected]> wrote:
>
> It was decided to remove the page_endio() as per the previous RFC
> discussion[1] of this series and move that functionality into the caller
> itself. One of the side benefit of doing that is the callers have been
> modified to directly work on folios as page_endio() already worked on
> folios.
>
> mpage changes were tested with a simple boot testing. zram and orangefs is
> only build tested. No functional changes were introduced as a part of
> this AFAIK.
>
> Open questions:
> - Willy pointed out that the calls to folio_set_error() and
> folio_clear_uptodate() are not needed anymore in the read path when an
> error happens[2]. I still don't understand 100% why they aren't needed
> anymore as I see those functions are still called in iomap. It will be
> good to put that rationale as a part of the commit message.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
> [2] https://lore.kernel.org/linux-mm/ZBSH6Uq6IIXON%[email protected]/
>
> Pankaj Raghav (5):
> zram: remove zram_page_end_io function
> orangefs: use folios in orangefs_readahead
> mpage: split bi_end_io callback for reads and writes
> mpage: use folios in bio end_io handler
> filemap: remove page_endio()
>
> drivers/block/zram/zram_drv.c | 13 +----------
> fs/mpage.c | 44 ++++++++++++++++++++++++++++-------
> fs/orangefs/inode.c | 9 +++----
> include/linux/pagemap.h | 2 --
> mm/filemap.c | 30 ------------------------
> 5 files changed, 42 insertions(+), 56 deletions(-)
>
> --
> 2.34.1
>

2023-03-23 15:10:47

by Pankaj Raghav

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

>> Open questions:
>> - Willy pointed out that the calls to folio_set_error() and
>> folio_clear_uptodate() are not needed anymore in the read path when an
>> error happens[2]. I still don't understand 100% why they aren't needed
>> anymore as I see those functions are still called in iomap. It will be
>> good to put that rationale as a part of the commit message.
>
> page_endio() was generic. It needed to handle a lot of cases. When it's
> being inlined into various completion routines, we know which cases we
> need to handle and can omit all the cases which we don't.
>
> We know the uptodate flag is clear. If the uptodate flag is set,
> we don't call the filesystem's read path. Since we know it's clear,
> we don't need to clear it.
>
Got it.

> We don't need to set the error flag. Only some filesystems still use
> the error flag, and orangefs isn't one of them. I'd like to get rid
> of the error flag altogether, and I've sent patches in the past which
> get us a lot closer to that desired outcome. Not sure we're there yet.
> Regardless, generic code doesn't check the error flag.

Thanks for the explanation. I think found the series you are referring here.

https://lore.kernel.org/linux-mm/[email protected]/#t

I see orangefs is still setting the error flag in orangefs_read_folio(), so
it should be removed at some point?

I also changed mpage to **not set** the error flag in the read path. It does beg
the question whether block_read_full_folio() and iomap_finish_folio_read() should
also follow the suit.

--
Pankaj

2023-03-23 15:51:27

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

On Thu, Mar 23, 2023 at 04:00:37PM +0100, Pankaj Raghav wrote:
> > We don't need to set the error flag. Only some filesystems still use
> > the error flag, and orangefs isn't one of them. I'd like to get rid
> > of the error flag altogether, and I've sent patches in the past which
> > get us a lot closer to that desired outcome. Not sure we're there yet.
> > Regardless, generic code doesn't check the error flag.
>
> Thanks for the explanation. I think found the series you are referring here.
>
> https://lore.kernel.org/linux-mm/[email protected]/#t
>
> I see orangefs is still setting the error flag in orangefs_read_folio(), so
> it should be removed at some point?

Yes, OrangeFS only sets the error flag, it never checks it, so it never
needs to set it.

> I also changed mpage to **not set** the error flag in the read path. It does beg
> the question whether block_read_full_folio() and iomap_finish_folio_read() should
> also follow the suit.

Wrong. mpage is used by filesystems which *DO* check the error flag.
You can't remove it being set until they're fixed to not check it.

2023-03-23 16:32:31

by Pankaj Raghav

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

>> I also changed mpage to **not set** the error flag in the read path. It does beg
>> the question whether block_read_full_folio() and iomap_finish_folio_read() should
>> also follow the suit.
>
> Wrong. mpage is used by filesystems which *DO* check the error flag.
> You can't remove it being set until they're fixed to not check it.
Got it. I think after your explanation on the Error flag, it makes sense why mpage
needs to set the error flag, for now. I will change it as a part of the next version
along with the other comment on Patch 4.

2023-03-23 16:33:29

by Pankaj Raghav

[permalink] [raw]
Subject: Re: [RFC v2 0/5] remove page_endio()

On 2023-03-23 15:30, Mike Marshall wrote:
> I have tested this patch on orangefs on top of 6.3.0-rc3, no
> regressions.
>

Perfect. Thanks a lot. Could I get a Tested-by from you for the
current change?

> It is very easy to build a single host orangefs test system on
> a vm. There are instructions in orangefs.rst, and also I'd
> be glad to help make them better...
>
Sounds good. I can do it next time if I change the current code.