From: Alexander Duyck
Date: Fri, 27 Jan 2023 07:36:32 -0800
Subject: Re: [PATCH] page_pool: add a comment explaining the fragment counter usage
To: Ilias Apalodimas
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, "David S. Miller",
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel@vger.kernel.org
In-Reply-To: <20230127101627.891614-1-ilias.apalodimas@linaro.org>

On Fri, Jan 27, 2023 at 2:16 AM Ilias Apalodimas wrote:
>
> When reading the page_pool code the first impression is that keeping
> two separate counters, one being the page refcnt and the other being
> the fragment pp_frag_count, is counter-intuitive.
>
> However, without that fragment counter we don't know when we can
> reliably destroy or sync the outstanding DMA mappings.
> So let's add a comment explaining this part.
>
> Signed-off-by: Ilias Apalodimas
> ---
>  include/net/page_pool.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index 813c93499f20..115dbce6d431 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -277,6 +277,14 @@ void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
>  					unsigned int dma_sync_size,
>  					bool allow_direct);
>
> +/* pp_frag_count is our number of outstanding DMA maps. We can't rely on the
> + * page refcnt for that as we don't know who might be holding page references
> + * and we can't reliably destroy or sync DMA mappings of the fragments.
> + *

This isn't quite right. Each frag is writable by the holder of that
frag, so pp_frag_count represents the number of writers that could
still update the page, either by writing to skb->data or via DMA from
the device.

> + * When pp_frag_count reaches 0 we can either recycle the page, if the page
> + * refcnt is 1, or return it back to the memory allocator and destroy any
> + * mappings we have.
> + */
>  static inline void page_pool_fragment_page(struct page *page, long nr)
>  {
>  	atomic_long_set(&page->pp_frag_count, nr);

The rest of this looks good to me.
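
To make the counting scheme discussed above easier to follow in
isolation, here is a minimal user-space sketch of the idea. The names
(frag_page, frag_page_init, frag_put) are invented for illustration and
this is not the kernel's page_pool API; it only mimics the rule "when
the last fragment is dropped it is safe to sync or tear down the DMA
mapping, and the page is recycled only if we hold the sole reference".

/*
 * Minimal user-space sketch of the two-counter scheme described above.
 * Names are invented for illustration; this is not page_pool code.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct frag_page {
	atomic_long frag_count;	/* outstanding fragments, i.e. potential writers */
	atomic_long refcount;	/* plain "who holds a reference" count */
	bool dma_mapped;	/* stand-in for the page's DMA mapping state */
};

/* Hand out nr fragments of the page in one shot. */
static void frag_page_init(struct frag_page *p, long nr)
{
	atomic_store(&p->frag_count, nr);
	atomic_store(&p->refcount, 1);
	p->dma_mapped = true;
}

/* A fragment holder is done writing to its slice (CPU or device DMA). */
static void frag_put(struct frag_page *p)
{
	if (atomic_fetch_sub(&p->frag_count, 1) != 1)
		return;		/* other writers may still touch the page */

	/*
	 * Last fragment dropped: no writer (CPU or device) is left, so it
	 * is now safe to sync or tear down the DMA mapping.
	 */
	if (atomic_load(&p->refcount) == 1) {
		/* Sole owner: keep the mapping and recycle the page. */
		printf("recycle page into the pool (mapping kept)\n");
	} else {
		/* Someone else still holds a reference: drop our mapping
		 * and give the page back to the allocator.
		 */
		p->dma_mapped = false;
		printf("unmap and return page to the allocator\n");
	}
}

int main(void)
{
	struct frag_page page;

	frag_page_init(&page, 3);	/* e.g. three rx buffers share one page */
	frag_put(&page);
	frag_put(&page);
	frag_put(&page);		/* last put decides recycle vs. free */
	return 0;
}

The sketch shows why the separate counter matters: frag_put() makes the
sync/unmap decision purely from the fragment count, and only then looks
at the reference count to choose between recycling the page and
returning it to the allocator, which is the property the added comment
is documenting.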