LinuxLists.cc - Re: [PATCH net-next v1 12/12] mm: page_frag: update documentation and maintainer for page

2024-04-07 18:14:13

Subject: Re: [PATCH net-next v1 12/12] mm: page_frag: update documentation and maintainer for page_frag

On Sun, 2024-04-07 at 21:08 +0800, Yunsheng Lin wrote:
> Update documentation about design, implementation and API usages
> for page_frag.
>
> Also update MAINTAINERS for page_frag. Alexander seems to be the
> orginal author for page_frag, we can add him to the MAINTAINERS
> later if we have an ack from him.
>
> CC: Alexander Duyck <[email protected]>
> Signed-off-by: Yunsheng Lin <[email protected]>

Again, this seems more like 2 different pathches at least. One for the
Documentation and MAINTAINERS changes, and one for the function
documentation.

> ---
> Documentation/mm/page_frags.rst | 115 ++++++++++++++++++----------
> MAINTAINERS | 10 +++
> include/linux/page_frag_cache.h | 128 ++++++++++++++++++++++++++++++++
> mm/page_frag_cache.c | 51 ++++++++++---
> 4 files changed, 256 insertions(+), 48 deletions(-)
>
> diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
> index 503ca6cdb804..77256dfb58bf 100644
> --- a/Documentation/mm/page_frags.rst
> +++ b/Documentation/mm/page_frags.rst
> @@ -1,43 +1,80 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> ==============
> Page fragments
> ==============
>
> -A page fragment is an arbitrary-length arbitrary-offset area of memory
> -which resides within a 0 or higher order compound page. Multiple
> -fragments within that page are individually refcounted, in the page's
> -reference counter.
> -
> -The page_frag functions, page_frag_alloc and page_frag_free, provide a
> -simple allocation framework for page fragments. This is used by the
> -network stack and network device drivers to provide a backing region of
> -memory for use as either an sk_buff->head, or to be used in the "frags"
> -portion of skb_shared_info.
> -
> -In order to make use of the page fragment APIs a backing page fragment
> -cache is needed. This provides a central point for the fragment allocation
> -and tracks allows multiple calls to make use of a cached page. The
> -advantage to doing this is that multiple calls to get_page can be avoided
> -which can be expensive at allocation time. However due to the nature of
> -this caching it is required that any calls to the cache be protected by
> -either a per-cpu limitation, or a per-cpu limitation and forcing interrupts
> -to be disabled when executing the fragment allocation.
> -
> -The network stack uses two separate caches per CPU to handle fragment
> -allocation. The netdev_alloc_cache is used by callers making use of the
> -netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is
> -used by callers of the __napi_alloc_frag and napi_alloc_skb calls. The
> -main difference between these two calls is the context in which they may be
> -called. The "netdev" prefixed functions are usable in any context as these
> -functions will disable interrupts, while the "napi" prefixed functions are
> -only usable within the softirq context.
> -
> -Many network device drivers use a similar methodology for allocating page
> -fragments, but the page fragments are cached at the ring or descriptor
> -level. In order to enable these cases it is necessary to provide a generic
> -way of tearing down a page cache. For this reason __page_frag_cache_drain
> -was implemented. It allows for freeing multiple references from a single
> -page via a single call. The advantage to doing this is that it allows for
> -cleaning up the multiple references that were added to a page in order to
> -avoid calling get_page per allocation.
> -
> -Alexander Duyck, Nov 29, 2016.

What is the point of removing this just to add it to a C file further
down in the diff? Honestly I am not a fan of all the noise this is
adding to these diffs. Can we do a little less moving of lines for the
sake of moving them? All it does is pollute the git blame if you try to
figure out the origin of the lines.

> +.. kernel-doc:: mm/page_frag_cache.c
> + :doc: page_frag allocator
> +
> +Architecture overview
> +=====================
> +
> +.. code-block:: none
> +
> + +----------------------+
> + | page_frag API caller |
> + +----------------------+
> + ^
> + |
> + |
> + |
> + v
> + +----------------------------------------------+
> + | request page fragment |
> + +----------------------------------------------+
> + ^ ^
> + | |
> + | Cache empty or not enough |
> + | |
> + v |
> + +--------------------------------+ |
> + | refill cache with order 3 page | |
> + +--------------------------------+ |
> + ^ ^ |
> + | | |
> + | | Refill failed |
> + | | | Cache is enough
> + | | |
> + | v |
> + | +----------------------------------+ |
> + | | refill cache with order 0 page | |
> + | +----------------------------------+ |
> + | ^ |
> + | Refill succeed | |
> + | | Refill succeed |
> + | | |
> + v v v
> + +----------------------------------------------+
> + | allocate fragment from cache |
> + +----------------------------------------------+
> +

+1 for the simple visualization of how this works.

> +API interface
> +=============
> +As the design and implementation of page_frag API, the allocation side does not
> +allow concurrent calling, it is assumed that the caller must ensure there is not
> +concurrent alloc calling to the same page_frag_cache instance by using it's own
> +lock or rely on some lockless guarantee like NAPI softirq.
> +
> +Depending on different use cases, callers expecting to deal with va, page or
> +both va and page for them may call page_frag_alloc_va(), page_frag_alloc_pg(),
> +or page_frag_alloc() accordingly.
> +

So the new documentation is good up to here.

> +There is also a use case that need minimum memory in order for forward
> +progressing, but can do better if there is more memory available. Introduce
> +page_frag_alloc_prepare() and page_frag_alloc_commit() related API, the caller
> +requests the minimum memory it need and the prepare API will return the maximum
> +size of the fragment returned, caller need to report back to the page_frag core
> +how much memory it actually use by calling commit API, or not calling the commit
> +API if deciding to not use any memory.
> +

This part is as clear as mud to me. It sounds like kind of a convoluted
setup where you are having the caller have to know a fair bit about the
internal structure of the cache and it is essentially checking the
state and then performing a commit. Not a huge fan. I would almost
prefer to see something more like what we used to do with msix where
you just had a range you could request and if it can't give you at
least the minimum it fails.

I assume the patch is somewhere here in the set. Will take a look at it
later.

> +.. kernel-doc:: include/linux/page_frag_cache.h
> + :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc
> + page_frag_alloc_va __page_frag_alloc_va_align
> + page_frag_alloc_va_align page_frag_alloc_va_prepare
> + page_frag_alloc_va_prepare_align page_frag_alloc_pg_prepare
> + page_frag_alloc_prepare page_frag_alloc_commit
> + page_frag_alloc_commit_noref page_frag_free_va
> +
> +.. kernel-doc:: mm/page_frag_cache.c
> + :identifiers: page_frag_cache_drain
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4745ea94d463..2f84aba59428 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16683,6 +16683,16 @@ F: mm/page-writeback.c
> F: mm/readahead.c
> F: mm/truncate.c
>
> +PAGE FRAG
> +M: Yunsheng Lin <[email protected]>
> +L: [email protected]
> +L: [email protected]
> +S: Supported
> +F: Documentation/mm/page_frags.rst
> +F: include/linux/page_frag_cache.h
> +F: mm/page_frag_cache.c
> +F: mm/page_frag_test.c
> +

I would appreciate it if you could add me as I usually am having to
deal with issues people have with this anyway. You can probably just go
with:
Alexander Duyck <[email protected]>

> PAGE POOL
> M: Jesper Dangaard Brouer <[email protected]>
> M: Ilias Apalodimas <[email protected]>
> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> index 28185969cd2c..d8edbecdd179 100644
> --- a/include/linux/page_frag_cache.h
> +++ b/include/linux/page_frag_cache.h
> @@ -31,11 +31,23 @@ struct page_frag_cache {
> #endif
> };
>
> +/**
> + * page_frag_cache_init() - Init page_frag cache.
> + * @nc: page_frag cache from which to init
> + *
> + * Inline helper to init the page_frag cache.
> + */
> static inline void page_frag_cache_init(struct page_frag_cache *nc)
> {
> nc->va = NULL;
> }
>
> +/**
> + * page_frag_cache_is_pfmemalloc() - Check for pfmemalloc.
> + * @nc: page_frag cache from which to check
> + *
> + * Used to check if the current page in page_frag cache is pfmemalloc'ed.
> + */
> static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
> {
> return !!nc->pfmemalloc;
> @@ -46,6 +58,17 @@ void __page_frag_cache_drain(struct page *page, unsigned int count);
> void *page_frag_cache_refill(struct page_frag_cache *nc, unsigned int fragsz,
> gfp_t gfp_mask);
>
> +/**
> + * page_frag_alloc_va() - Alloc a page fragment.
> + * @nc: page_frag cache from which to allocate
> + * @fragsz: the requested fragment size
> + * @gfp_mask: the allocation gfp to use when cache need to be refilled
> + *
> + * Get a page fragment from page_frag cache.
> + *
> + * Return:
> + * Return va of the page fragment, otherwise return NULL.
> + */
> static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
> unsigned int fragsz, gfp_t gfp_mask)
> {
> @@ -63,6 +86,19 @@ static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
> return va + offset;
> }
>
> +/**
> + * __page_frag_alloc_va_align() - Alloc a page fragment with aligning
> + * requirement.
> + * @nc: page_frag cache from which to allocate
> + * @fragsz: the requested fragment size
> + * @gfp_mask: the allocation gfp to use when cache need to be refilled
> + * @align: the requested aligning requirement
> + *
> + * Get a page fragment from page_frag cache with aligning requirement.
> + *
> + * Return:
> + * Return va of the page fragment, otherwise return NULL.
> + */
> static inline void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
> unsigned int fragsz,
> gfp_t gfp_mask,
> @@ -75,6 +111,19 @@ static inline void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
> return page_frag_alloc_va(nc, fragsz, gfp_mask);
> }
>
> +/**
> + * page_frag_alloc_va_align() - Alloc a page fragment with aligning requirement.
> + * @nc: page_frag cache from which to allocate
> + * @fragsz: the requested fragment size
> + * @gfp_mask: the allocation gfp to use when cache need to be refilled
> + * @align: the requested aligning requirement
> + *
> + * WARN_ON_ONCE() checking for align and fragsz before getting a page fragment
> + * from page_frag cache with aligning requirement.
> + *
> + * Return:
> + * Return va of the page fragment, otherwise return NULL.
> + */
> static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
> unsigned int fragsz,
> gfp_t gfp_mask,
> @@ -86,6 +135,19 @@ static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
> return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, align);
> }
>
> +/**
> + * page_frag_alloc_va_prepare() - Prepare allocing a page fragment.
> + * @nc: page_frag cache from which to prepare
> + * @offset: out as the offset of the page fragment
> + * @size: in as the requested size, out as the available size
> + * @gfp_mask: the allocation gfp to use when cache need to be refilled
> + *
> + * Prepare a page fragment with minimum size of ‘size’, 'size' is also used to
> + * report the maximum size of the page fragment the caller can use.
> + *
> + * Return:
> + * Return va of the page fragment, otherwise return NULL.
> + */
> static inline void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
> unsigned int *offset,
> unsigned int *size,
> @@ -108,6 +170,21 @@ static inline void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
> return va + *offset;
> }
>
> +/**
> + * page_frag_alloc_va_prepare_align() - Prepare allocing a page fragment with
> + * aligning requirement.
> + * @nc: page_frag cache from which to prepare
> + * @offset: out as the offset of the page fragment
> + * @size: in as the requested size, out as the available size
> + * @align: the requested aligning requirement
> + * @gfp_mask: the allocation gfp to use when cache need to be refilled
> + *
> + * Prepare an aligned page fragment with minimum size of ‘size’, 'size' is also
> + * used to report the maximum size of the page fragment the caller can use.
> + *
> + * Return:
> + * Return va of the page fragment, otherwise return NULL.
> + */
> static inline void *page_frag_alloc_va_prepare_align(struct page_frag_cache *nc,
> unsigned int *offset,
> unsigned int *size,
> @@ -144,6 +221,19 @@ static inline void *__page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
> return va;
> }
>
> +/**
> + * page_frag_alloc_pg_prepare - Prepare allocing a page fragment.
> + * @nc: page_frag cache from which to prepare
> + * @offset: out as the offset of the page fragment
> + * @size: in as the requested size, out as the available size
> + * @gfp: the allocation gfp to use when cache need to be refilled
> + *
> + * Prepare a page fragment with minimum size of ‘size’, 'size' is also used to
> + * report the maximum size of the page fragment the caller can use.
> + *
> + * Return:
> + * Return the page fragment, otherwise return NULL.
> + */
> #define page_frag_alloc_pg_prepare(nc, offset, size, gfp) \
> ({ \
> struct page *__page = NULL; \
> @@ -179,6 +269,21 @@ static inline void *__page_frag_alloc_prepare(struct page_frag_cache *nc,
> return nc_va;
> }
>
> +/**
> + * page_frag_alloc_prepare - Prepare allocing a page fragment.
> + * @nc: page_frag cache from which to prepare
> + * @offset: out as the offset of the page fragment
> + * @size: in as the requested size, out as the available size
> + * @va: out as the va of the returned page fragment
> + * @gfp: the allocation gfp to use when cache need to be refilled
> + *
> + * Prepare a page fragment with minimum size of ‘size’, 'size' is also used to
> + * report the maximum size of the page fragment. Return both 'page' and 'va' of
> + * the fragment to the caller.
> + *
> + * Return:
> + * Return the page fragment, otherwise return NULL.
> + */
> #define page_frag_alloc_prepare(nc, offset, size, va, gfp) \
> ({ \
> struct page *__page = NULL; \
> @@ -191,6 +296,14 @@ static inline void *__page_frag_alloc_prepare(struct page_frag_cache *nc,
> __page; \
> })
>
> +/**
> + * page_frag_alloc_commit - Commit allocing a page fragment.
> + * @nc: page_frag cache from which to commit
> + * @offset: offset of the page fragment
> + * @size: size of the page fragment has been used
> + *
> + * Commit the alloc preparing by passing offset and the actual used size.
> + */
> static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
> unsigned int offset,
> unsigned int size)
> @@ -199,6 +312,17 @@ static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
> nc->offset = offset + size;
> }
>
> +/**
> + * page_frag_alloc_commit_noref - Commit allocing a page fragment without taking
> + * page refcount.
> + * @nc: page_frag cache from which to commit
> + * @offset: offset of the page fragment
> + * @size: size of the page fragment has been used
> + *
> + * Commit the alloc preparing by passing offset and the actual used size, but
> + * not taking page refcount. Mostly used for fragmemt coaleasing case when the
> + * current fragmemt can share the same refcount with previous fragmemt.
> + */
> static inline void page_frag_alloc_commit_noref(struct page_frag_cache *nc,
> unsigned int offset,
> unsigned int size)
> @@ -206,6 +330,10 @@ static inline void page_frag_alloc_commit_noref(struct page_frag_cache *nc,
> nc->offset = offset + size;
> }
>
> +/**
> + * page_frag_free_va - Free a page fragment by va.
> + * @addr: va of page fragment to be freed
> + */
> void page_frag_free_va(void *addr);
>
> #endif
> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> index cbd0ed82a596..0c76ec006c22 100644
> --- a/mm/page_frag_cache.c
> +++ b/mm/page_frag_cache.c
> @@ -1,15 +1,44 @@
> // SPDX-License-Identifier: GPL-2.0-only
> -/* Page fragment allocator
> +
> +/**
> + * DOC: page_frag allocator
> + *
> + * A page fragment is an arbitrary-length arbitrary-offset area of memory which
> + * resides within a 0 or higher order compound page. Multiple fragments within
> + * that page are individually refcounted, in the page's reference counter.
> + *
> + * The page_frag functions, page_frag_alloc* and page_frag_free*, provide a
> + * simple allocation framework for page fragments. This is used by the network
> + * stack and network device drivers to provide a backing region of memory for
> + * use as either an sk_buff->head, or to be used in the "frags" portion of
> + * skb_shared_info.
> *
> - * Page Fragment:
> - * An arbitrary-length arbitrary-offset area of memory which resides within a
> - * 0 or higher order page. Multiple fragments within that page are
> - * individually refcounted, in the page's reference counter.
> + * In order to make use of the page fragment APIs a backing page fragment cache
> + * is needed. This provides a central point for the fragment allocation and
> + * tracks allows multiple calls to make use of a cached page. The advantage to
> + * doing this is that multiple calls to get_page can be avoided which can be
> + * expensive at allocation time. However due to the nature of this caching it
> + * is required that any calls to the cache be protected by either a per-cpu
> + * limitation, or a per-cpu limitation and forcing interrupts to be disabled
> + * when executing the fragment allocation.
> *
> - * The page_frag functions provide a simple allocation framework for page
> - * fragments. This is used by the network stack and network device drivers to
> - * provide a backing region of memory for use as either an sk_buff->head, or to
> - * be used in the "frags" portion of skb_shared_info.
> + * The network stack uses two separate caches per CPU to handle fragment
> + * allocation. The netdev_alloc_cache is used by callers making use of the
> + * netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is
> + * used by callers of the __napi_alloc_frag and napi_alloc_skb calls. The
> + * main difference between these two calls is the context in which they may be
> + * called. The "netdev" prefixed functions are usable in any context as these
> + * functions will disable interrupts, while the "napi" prefixed functions are
> + * only usable within the softirq context.
> + *
> + * Many network device drivers use a similar methodology for allocating page
> + * fragments, but the page fragments are cached at the ring or descriptor
> + * level. In order to enable these cases it is necessary to provide a generic
> + * way of tearing down a page cache. For this reason __page_frag_cache_drain
> + * was implemented. It allows for freeing multiple references from a single
> + * page via a single call. The advantage to doing this is that it allows for
> + * cleaning up the multiple references that were added to a page in order to
> + * avoid calling get_page per allocation.
> */
>

Again, not a huge fan of moving this. It would be better to just leave
it where it was and add your documentation onto it.

> #include <linux/export.h>
> @@ -57,6 +86,10 @@ static bool __page_frag_cache_refill(struct page_frag_cache *nc,
> return true;
> }
>
> +/**
> + * page_frag_cache_drain - Drain the current page from page_frag cache.
> + * @nc: page_frag cache from which to drain
> + */
> void page_frag_cache_drain(struct page_frag_cache *nc)
> {
> if (!nc->va)

2024-04-08 13:45:37

by Yunsheng Lin

[permalink] [raw]

Subject: Re: [PATCH net-next v1 12/12] mm: page_frag: update documentation and maintainer for page_frag

On 2024/4/8 2:13, Alexander H Duyck wrote:
> On Sun, 2024-04-07 at 21:08 +0800, Yunsheng Lin wrote:
>> Update documentation about design, implementation and API usages
>> for page_frag.
>>
>> Also update MAINTAINERS for page_frag. Alexander seems to be the
>> orginal author for page_frag, we can add him to the MAINTAINERS
>> later if we have an ack from him.
>>
>> CC: Alexander Duyck <[email protected]>
>> Signed-off-by: Yunsheng Lin <[email protected]>
>
> Again, this seems more like 2 different pathches at least. One for the
> Documentation and MAINTAINERS changes, and one for the function
> documentation.

Sure.

>
>> ---
>> Documentation/mm/page_frags.rst | 115 ++++++++++++++++++----------
>> MAINTAINERS | 10 +++
>> include/linux/page_frag_cache.h | 128 ++++++++++++++++++++++++++++++++
>> mm/page_frag_cache.c | 51 ++++++++++---
>> 4 files changed, 256 insertions(+), 48 deletions(-)
>>
>> diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
>> index 503ca6cdb804..77256dfb58bf 100644
>> --- a/Documentation/mm/page_frags.rst
>> +++ b/Documentation/mm/page_frags.rst
>> @@ -1,43 +1,80 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> ==============
>> Page fragments
>> ==============
>>
>> -A page fragment is an arbitrary-length arbitrary-offset area of memory
>> -which resides within a 0 or higher order compound page. Multiple
>> -fragments within that page are individually refcounted, in the page's
>> -reference counter.
>> -
>> -The page_frag functions, page_frag_alloc and page_frag_free, provide a
>> -simple allocation framework for page fragments. This is used by the
>> -network stack and network device drivers to provide a backing region of
>> -memory for use as either an sk_buff->head, or to be used in the "frags"
>> -portion of skb_shared_info.
>> -
>> -In order to make use of the page fragment APIs a backing page fragment
>> -cache is needed. This provides a central point for the fragment allocation
>> -and tracks allows multiple calls to make use of a cached page. The
>> -advantage to doing this is that multiple calls to get_page can be avoided
>> -which can be expensive at allocation time. However due to the nature of
>> -this caching it is required that any calls to the cache be protected by
>> -either a per-cpu limitation, or a per-cpu limitation and forcing interrupts
>> -to be disabled when executing the fragment allocation.
>> -
>> -The network stack uses two separate caches per CPU to handle fragment
>> -allocation. The netdev_alloc_cache is used by callers making use of the
>> -netdev_alloc_frag and __netdev_alloc_skb calls. The napi_alloc_cache is
>> -used by callers of the __napi_alloc_frag and napi_alloc_skb calls. The
>> -main difference between these two calls is the context in which they may be
>> -called. The "netdev" prefixed functions are usable in any context as these
>> -functions will disable interrupts, while the "napi" prefixed functions are
>> -only usable within the softirq context.
>> -
>> -Many network device drivers use a similar methodology for allocating page
>> -fragments, but the page fragments are cached at the ring or descriptor
>> -level. In order to enable these cases it is necessary to provide a generic
>> -way of tearing down a page cache. For this reason __page_frag_cache_drain
>> -was implemented. It allows for freeing multiple references from a single
>> -page via a single call. The advantage to doing this is that it allows for
>> -cleaning up the multiple references that were added to a page in order to
>> -avoid calling get_page per allocation.
>> -
>> -Alexander Duyck, Nov 29, 2016.
>
> What is the point of removing this just to add it to a C file further
> down in the diff? Honestly I am not a fan of all the noise this is
> adding to these diffs. Can we do a little less moving of lines for the
> sake of moving them? All it does is pollute the git blame if you try to
> figure out the origin of the lines.

I was thinking about move the doc related code to file where code is related,
so that author will remember to update the doc when changing the code.
Maybe above does not matter that much?

>
>> +.. kernel-doc:: mm/page_frag_cache.c
>> + :doc: page_frag allocator
>> +
>> +Architecture overview
>> +=====================
>> +
>> +.. code-block:: none
>> +
>> + +----------------------+
>> + | page_frag API caller |
>> + +----------------------+
>> + ^
>> + |
>> + |
>> + |
>> + v
>> + +----------------------------------------------+
>> + | request page fragment |
>> + +----------------------------------------------+
>> + ^ ^
>> + | |
>> + | Cache empty or not enough |
>> + | |
>> + v |
>> + +--------------------------------+ |
>> + | refill cache with order 3 page | |
>> + +--------------------------------+ |
>> + ^ ^ |
>> + | | |
>> + | | Refill failed |
>> + | | | Cache is enough
>> + | | |
>> + | v |
>> + | +----------------------------------+ |
>> + | | refill cache with order 0 page | |
>> + | +----------------------------------+ |
>> + | ^ |
>> + | Refill succeed | |
>> + | | Refill succeed |
>> + | | |
>> + v v v
>> + +----------------------------------------------+
>> + | allocate fragment from cache |
>> + +----------------------------------------------+
>> +
>
> +1 for the simple visualization of how this works.
>
>> +API interface
>> +=============
>> +As the design and implementation of page_frag API, the allocation side does not
>> +allow concurrent calling, it is assumed that the caller must ensure there is not
>> +concurrent alloc calling to the same page_frag_cache instance by using it's own
>> +lock or rely on some lockless guarantee like NAPI softirq.
>> +
>> +Depending on different use cases, callers expecting to deal with va, page or
>> +both va and page for them may call page_frag_alloc_va(), page_frag_alloc_pg(),
>> +or page_frag_alloc() accordingly.
>> +
>
> So the new documentation is good up to here.
>
>> +There is also a use case that need minimum memory in order for forward
>> +progressing, but can do better if there is more memory available. Introduce
>> +page_frag_alloc_prepare() and page_frag_alloc_commit() related API, the caller
>> +requests the minimum memory it need and the prepare API will return the maximum
>> +size of the fragment returned, caller need to report back to the page_frag core
>> +how much memory it actually use by calling commit API, or not calling the commit
>> +API if deciding to not use any memory.
>> +
>
> This part is as clear as mud to me. It sounds like kind of a convoluted
> setup where you are having the caller have to know a fair bit about the
> internal structure of the cache and it is essentially checking the
> state and then performing a commit. Not a huge fan. I would almost
> prefer to see something more like what we used to do with msix where
> you just had a range you could request and if it can't give you at
> least the minimum it fails.>
> I assume the patch is somewhere here in the set. Will take a look at it
> later.

Yes, the API is introduced in patch 9 and used in patch 10.

>
>> +.. kernel-doc:: include/linux/page_frag_cache.h
>> + :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc
>> + page_frag_alloc_va __page_frag_alloc_va_align
>> + page_frag_alloc_va_align page_frag_alloc_va_prepare
>> + page_frag_alloc_va_prepare_align page_frag_alloc_pg_prepare
>> + page_frag_alloc_prepare page_frag_alloc_commit
>> + page_frag_alloc_commit_noref page_frag_free_va
>> +
>> +.. kernel-doc:: mm/page_frag_cache.c
>> + :identifiers: page_frag_cache_drain
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 4745ea94d463..2f84aba59428 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -16683,6 +16683,16 @@ F: mm/page-writeback.c
>> F: mm/readahead.c
>> F: mm/truncate.c
>>
>> +PAGE FRAG
>> +M: Yunsheng Lin <[email protected]>
>> +L: [email protected]
>> +L: [email protected]
>> +S: Supported
>> +F: Documentation/mm/page_frags.rst
>> +F: include/linux/page_frag_cache.h
>> +F: mm/page_frag_cache.c
>> +F: mm/page_frag_test.c
>> +
>
> I would appreciate it if you could add me as I usually am having to
> deal with issues people have with this anyway. You can probably just go
> with:
> Alexander Duyck <[email protected]>

Sure, good to your ack here.

>
>> PAGE POOL
>> M: Jesper Dangaard Brouer <[email protected]>
>> M: Ilias Apalodimas <[email protected]>
>> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
>> index 28185969cd2c..d8edbecdd179 100644