2023-09-15 12:53:18

by Daniel Gomez

[permalink] [raw]
Subject: [PATCH 1/6] filemap: make the folio order calculation shareable

To make the code that clamps the folio order in the __filemap_get_folio
routine reusable to others, move and merge it to the fgf_set_order
new subroutine (mapping_size_order), so when mapping the size at a
given index, the order calculated is already valid and ready to be
used when order is retrieved from fgp_flags with FGF_GET_ORDER.

Signed-off-by: Daniel Gomez <[email protected]>
---
fs/iomap/buffered-io.c | 6 ++++--
include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
mm/filemap.c | 8 --------
3 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index ae8673ce08b1..bfd9a22a9464 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -546,12 +546,14 @@ EXPORT_SYMBOL_GPL(iomap_is_partially_uptodate);
struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len)
{
fgf_t fgp = FGP_WRITEBEGIN | FGP_NOFS;
+ pgoff_t index = pos >> PAGE_SHIFT;
+ struct address_space *mapping = iter->inode->i_mapping;

if (iter->flags & IOMAP_NOWAIT)
fgp |= FGP_NOWAIT;
- fgp |= fgf_set_order(len);
+ fgp |= fgf_set_order(mapping, index, len);

- return __filemap_get_folio(iter->inode->i_mapping, pos >> PAGE_SHIFT,
+ return __filemap_get_folio(mapping, index,
fgp, mapping_gfp_mask(iter->inode->i_mapping));
}
EXPORT_SYMBOL_GPL(iomap_get_folio);
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 351c3b7f93a1..7af5636eb32a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -576,6 +576,39 @@ typedef unsigned int __bitwise fgf_t;

#define FGP_WRITEBEGIN (FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE)

+/**
+ * mapping_size_order - Get maximum folio order for the given file size.
+ * @mapping: Target address_space.
+ * @index: The page index.
+ * @size: The suggested size of the folio to create.
+ *
+ * This returns a high order for folios (when supported) based on the file size
+ * which the mapping currently allows at the given index. The index is relevant
+ * due to alignment considerations the mapping might have. The returned order
+ * may be less than the size passed.
+ *
+ * Return: The order.
+ */
+static inline unsigned int mapping_size_order(struct address_space *mapping,
+ pgoff_t index, size_t size)
+{
+ unsigned int order = ilog2(size);
+
+ if ((order <= PAGE_SHIFT) || (!mapping_large_folio_support(mapping)))
+ return 0;
+ else
+ order = order - PAGE_SHIFT;
+
+ /* If we're not aligned, allocate a smaller folio */
+ if (index & ((1UL << order) - 1))
+ order = __ffs(index);
+
+ order = min_t(size_t, order, MAX_PAGECACHE_ORDER);
+
+ /* Order-1 not supported due to THP dependency */
+ return (order == 1) ? 0 : order;
+}
+
/**
* fgf_set_order - Encode a length in the fgf_t flags.
* @size: The suggested size of the folio to create.
@@ -587,13 +620,12 @@ typedef unsigned int __bitwise fgf_t;
* due to alignment constraints, memory pressure, or the presence of
* other folios at nearby indices.
*/
-static inline fgf_t fgf_set_order(size_t size)
+static inline fgf_t fgf_set_order(struct address_space *mapping, pgoff_t index,
+ size_t size)
{
- unsigned int shift = ilog2(size);
+ unsigned int order = mapping_size_order(mapping, index, size);

- if (shift <= PAGE_SHIFT)
- return 0;
- return (__force fgf_t)((shift - PAGE_SHIFT) << 26);
+ return (__force fgf_t)(order << 26);
}

void *filemap_get_entry(struct address_space *mapping, pgoff_t index);
diff --git a/mm/filemap.c b/mm/filemap.c
index 582f5317ff71..e285fffa9bcf 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1917,14 +1917,6 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
fgp_flags |= FGP_LOCK;

- if (!mapping_large_folio_support(mapping))
- order = 0;
- if (order > MAX_PAGECACHE_ORDER)
- order = MAX_PAGECACHE_ORDER;
- /* If we're not aligned, allocate a smaller folio */
- if (index & ((1UL << order) - 1))
- order = __ffs(index);
-
do {
gfp_t alloc_gfp = gfp;

--
2.39.2


2023-09-15 17:58:30

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/6] filemap: make the folio order calculation shareable

On Fri, Sep 15, 2023 at 09:51:23AM +0000, Daniel Gomez wrote:
> To make the code that clamps the folio order in the __filemap_get_folio
> routine reusable to others, move and merge it to the fgf_set_order
> new subroutine (mapping_size_order), so when mapping the size at a
> given index, the order calculated is already valid and ready to be
> used when order is retrieved from fgp_flags with FGF_GET_ORDER.
>
> Signed-off-by: Daniel Gomez <[email protected]>
> ---
> fs/iomap/buffered-io.c | 6 ++++--
> include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
> mm/filemap.c | 8 --------
> 3 files changed, 41 insertions(+), 15 deletions(-)

That seems like a lot of extra code to add in order to avoid copying
six lines of code and one comment into the shmem code.

It's not wrong, but it seems like a bad tradeoff to me.

2023-09-18 08:49:33

by Daniel Gomez

[permalink] [raw]
Subject: Re: [PATCH 1/6] filemap: make the folio order calculation shareable

On Fri, Sep 15, 2023 at 02:40:07PM +0100, Matthew Wilcox wrote:
> On Fri, Sep 15, 2023 at 09:51:23AM +0000, Daniel Gomez wrote:
> > To make the code that clamps the folio order in the __filemap_get_folio
> > routine reusable to others, move and merge it to the fgf_set_order
> > new subroutine (mapping_size_order), so when mapping the size at a
> > given index, the order calculated is already valid and ready to be
> > used when order is retrieved from fgp_flags with FGF_GET_ORDER.
> >
> > Signed-off-by: Daniel Gomez <[email protected]>
> > ---
> > fs/iomap/buffered-io.c | 6 ++++--
> > include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
> > mm/filemap.c | 8 --------
> > 3 files changed, 41 insertions(+), 15 deletions(-)
>
> That seems like a lot of extra code to add in order to avoid copying
> six lines of code and one comment into the shmem code.
>
> It's not wrong, but it seems like a bad tradeoff to me.

I saw some value in sharing the logic but I'll merge this code directly
in shmem and add a comment 'Like filemap_' similar to the one in
shmem_add_to_page_cache.

Thanks for the quick feedback and review Matthew. I'll do a V2 with the
changes and retest the series again using latest next tag.

2023-09-18 18:52:34

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 1/6] filemap: make the folio order calculation shareable

On Mon, Sep 18, 2023 at 07:24:55PM +0100, Matthew Wilcox wrote:
> On Mon, Sep 18, 2023 at 11:09:00AM -0700, Luis Chamberlain wrote:
> > On Fri, Sep 15, 2023 at 02:40:07PM +0100, Matthew Wilcox wrote:
> > > On Fri, Sep 15, 2023 at 09:51:23AM +0000, Daniel Gomez wrote:
> > > > To make the code that clamps the folio order in the __filemap_get_folio
> > > > routine reusable to others, move and merge it to the fgf_set_order
> > > > new subroutine (mapping_size_order), so when mapping the size at a
> > > > given index, the order calculated is already valid and ready to be
> > > > used when order is retrieved from fgp_flags with FGF_GET_ORDER.
> > > >
> > > > Signed-off-by: Daniel Gomez <[email protected]>
> > > > ---
> > > > fs/iomap/buffered-io.c | 6 ++++--
> > > > include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
> > > > mm/filemap.c | 8 --------
> > > > 3 files changed, 41 insertions(+), 15 deletions(-)
> > >
> > > That seems like a lot of extra code to add in order to avoid copying
> > > six lines of code and one comment into the shmem code.
> > >
> > > It's not wrong, but it seems like a bad tradeoff to me.
> >
> > The suggestion to merge came from me, mostly based on later observations
> > that in the future we may want to extend this with a min order to ensure
> > the index is aligned the the order. This check would only be useful for
> > buffred IO for iomap, readahead. It has me wondering if buffer-heads
> > support for large order folios come around would we a similar check
> > there?
> >
> > So Willy, you would know better if and when a shared piece of code would
> > be best with all these things in mind.
>
> In my mind, this is fundamentally code which belongs in the page cache
> rather than in individual filesystems. The fly in the ointment is that
> shmem has forked the page cache in order to do its own slightly
> specialised thing.

Do we do any effort *now* try to to not make that situation worse? This
just being one example.

> I don't see the buffer_head connection;

I haven't reviewed yet Hannes' patches yet but I was wondering if for bf
there was also a loop to target a high order and retry until you get the
min order allowed.

> shmem is
> an extremely special case, and we shouldn't mess around with other
> filesystems to avoid changing shmem.
>
> Ideally, we'd reunify (parts of) shmem and the regular page cache, but
> that's a lot of work, probably involving the swap layer changing.

Well so indeed swap effort will be next, so addressing if we should
not make the situation worse as we add more code would be good to know.

Luis

2023-09-19 00:22:17

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 1/6] filemap: make the folio order calculation shareable

On Fri, Sep 15, 2023 at 02:40:07PM +0100, Matthew Wilcox wrote:
> On Fri, Sep 15, 2023 at 09:51:23AM +0000, Daniel Gomez wrote:
> > To make the code that clamps the folio order in the __filemap_get_folio
> > routine reusable to others, move and merge it to the fgf_set_order
> > new subroutine (mapping_size_order), so when mapping the size at a
> > given index, the order calculated is already valid and ready to be
> > used when order is retrieved from fgp_flags with FGF_GET_ORDER.
> >
> > Signed-off-by: Daniel Gomez <[email protected]>
> > ---
> > fs/iomap/buffered-io.c | 6 ++++--
> > include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
> > mm/filemap.c | 8 --------
> > 3 files changed, 41 insertions(+), 15 deletions(-)
>
> That seems like a lot of extra code to add in order to avoid copying
> six lines of code and one comment into the shmem code.
>
> It's not wrong, but it seems like a bad tradeoff to me.

The suggestion to merge came from me, mostly based on later observations
that in the future we may want to extend this with a min order to ensure
the index is aligned the the order. This check would only be useful for
buffred IO for iomap, readahead. It has me wondering if buffer-heads
support for large order folios come around would we a similar check
there?

So Willy, you would know better if and when a shared piece of code would
be best with all these things in mind.

Luis

2023-09-19 05:58:39

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 1/6] filemap: make the folio order calculation shareable

On Mon, Sep 18, 2023 at 11:09:00AM -0700, Luis Chamberlain wrote:
> On Fri, Sep 15, 2023 at 02:40:07PM +0100, Matthew Wilcox wrote:
> > On Fri, Sep 15, 2023 at 09:51:23AM +0000, Daniel Gomez wrote:
> > > To make the code that clamps the folio order in the __filemap_get_folio
> > > routine reusable to others, move and merge it to the fgf_set_order
> > > new subroutine (mapping_size_order), so when mapping the size at a
> > > given index, the order calculated is already valid and ready to be
> > > used when order is retrieved from fgp_flags with FGF_GET_ORDER.
> > >
> > > Signed-off-by: Daniel Gomez <[email protected]>
> > > ---
> > > fs/iomap/buffered-io.c | 6 ++++--
> > > include/linux/pagemap.h | 42 ++++++++++++++++++++++++++++++++++++-----
> > > mm/filemap.c | 8 --------
> > > 3 files changed, 41 insertions(+), 15 deletions(-)
> >
> > That seems like a lot of extra code to add in order to avoid copying
> > six lines of code and one comment into the shmem code.
> >
> > It's not wrong, but it seems like a bad tradeoff to me.
>
> The suggestion to merge came from me, mostly based on later observations
> that in the future we may want to extend this with a min order to ensure
> the index is aligned the the order. This check would only be useful for
> buffred IO for iomap, readahead. It has me wondering if buffer-heads
> support for large order folios come around would we a similar check
> there?
>
> So Willy, you would know better if and when a shared piece of code would
> be best with all these things in mind.

In my mind, this is fundamentally code which belongs in the page cache
rather than in individual filesystems. The fly in the ointment is that
shmem has forked the page cache in order to do its own slightly
specialised thing. I don't see the buffer_head connection; shmem is
an extremely special case, and we shouldn't mess around with other
filesystems to avoid changing shmem.

Ideally, we'd reunify (parts of) shmem and the regular page cache, but
that's a lot of work, probably involving the swap layer changing.