2021-06-29 13:52:23

by Chuck Lever

Subject: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value

The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
bounds check after checking populated elements") was possibly
confused by the mixture of return values throughout the function.

The API contract is clear that the function "Returns the number of
pages on the list or array." It does not list zero as a unique
return value with a special meaning. Therefore zero is a plausible
return value only if @nr_pages is zero or less.

Clean up the return logic to make it clear that the returned value
is always the total number of pages in the array/list, not the
number of pages that were allocated during this call.

The only change in behavior with this patch is the value returned
if prepare_alloc_pages() fails. To match the API contract, the
number of pages currently in the array/list is returned in this
case.

The call site in __page_pool_alloc_pages_slow() also seems to be
confused on this matter. It should be attended to by someone who
is familiar with that code.

Signed-off-by: Chuck Lever <[email protected]>
---
mm/page_alloc.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e7af86e1a312..49eb2e134f9d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5059,7 +5059,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	int nr_populated = 0;
 
 	if (unlikely(nr_pages <= 0))
-		return 0;
+		goto out;
 
 	/*
 	 * Skip populated array elements to determine if any pages need
@@ -5070,7 +5070,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 	/* Already populated array? */
 	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return nr_populated;
+		goto out;
 
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
@@ -5080,7 +5080,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	gfp &= gfp_allowed_mask;
 	alloc_gfp = gfp;
 	if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask, &ac, &alloc_gfp, &alloc_flags))
-		return 0;
+		goto out;
 	gfp = alloc_gfp;
 
 	/* Find an allowed local zone that meets the low watermark. */
@@ -5153,6 +5153,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 	local_irq_restore(flags);
 
+out:
 	return nr_populated;
 
 failed_irq:
@@ -5168,7 +5169,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		nr_populated++;
 	}
 
-	return nr_populated;
+	goto out;
 }
 EXPORT_SYMBOL_GPL(__alloc_pages_bulk);
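
The return-value contract the patch enforces can be sketched with a small user-space model (the helper names and fake page pointers here are invented for illustration; this is not the kernel implementation):

```c
#include <assert.h>
#include <stddef.h>

/* Count leading populated (non-NULL) slots, as __alloc_pages_bulk()
 * does before allocating anything. */
static int count_populated(void **array, int nr_pages)
{
	int n = 0;

	while (n < nr_pages && array[n])
		n++;
	return n;
}

/* Simulate a bulk allocation that can supply at most 'available' new
 * pages.  On every exit path the return value is the total number of
 * populated slots, never just the number allocated by this call. */
static int bulk_alloc_model(void **array, int nr_pages, int available)
{
	int nr_populated;

	if (nr_pages <= 0)
		return 0;
	nr_populated = count_populated(array, nr_pages);
	while (nr_populated < nr_pages && available-- > 0)
		array[nr_populated++] = (void *)1;	/* fake page pointer */
	return nr_populated;
}
```

With two slots pre-populated and no pages available, the model still returns 2, matching the "number of pages on the list or array" wording rather than returning zero.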




2021-06-29 15:49:04

by Mel Gorman

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value

On Tue, Jun 29, 2021 at 09:48:15AM -0400, Chuck Lever wrote:
> The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
> bounds check after checking populated elements") was possibly
> confused by the mixture of return values throughout the function.
>
> The API contract is clear that the function "Returns the number of
> pages on the list or array." It does not list zero as a unique
> return value with a special meaning. Therefore zero is a plausible
> return value only if @nr_pages is zero or less.
>
> Clean up the return logic to make it clear that the returned value
> is always the total number of pages in the array/list, not the
> number of pages that were allocated during this call.
>
> The only change in behavior with this patch is the value returned
> if prepare_alloc_pages() fails. To match the API contract, the
> number of pages currently in the array/list is returned in this
> case.
>
> The call site in __page_pool_alloc_pages_slow() also seems to be
> confused on this matter. It should be attended to by someone who
> is familiar with that code.
>
> Signed-off-by: Chuck Lever <[email protected]>

Acked-by: Mel Gorman <[email protected]>

--
Mel Gorman
SUSE Labs

2021-06-29 16:04:43

by Jesper Dangaard Brouer

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value

On 29/06/2021 15.48, Chuck Lever wrote:

> The call site in __page_pool_alloc_pages_slow() also seems to be
> confused on this matter. It should be attended to by someone who
> is familiar with that code.

I don't think we need a fix for __page_pool_alloc_pages_slow(), as the
array is guaranteed to be empty.

But a fix would look like this:

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index c137ce308c27..1b04538a3da3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
        if (unlikely(pp_order))
                return __page_pool_alloc_page_order(pool, gfp);

        /* Unnecessary as alloc cache is empty, but guarantees zero count */
-       if (unlikely(pool->alloc.count > 0))
+       if (unlikely(pool->alloc.count > 0))   // ^^^^^^^^^^^^^^^^^^^^^^
                return pool->alloc.cache[--pool->alloc.count];

        /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
        memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);

+       /* bulk API ret value also count existing pages, but array is empty */
        nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
        if (unlikely(!nr_pages))
                return NULL;

        /* Pages have been filled into alloc.cache array, but count is zero and
         * page element have not been (possibly) DMA mapped.
         */
-       for (i = 0; i < nr_pages; i++) {
+       for (i = pool->alloc.count; i < nr_pages; i++) {
                page = pool->alloc.cache[i];
                if ((pp_flags & PP_FLAG_DMA_MAP) &&
                    unlikely(!page_pool_dma_map(pool, page))) {
                        put_page(page);
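
Why the existing `!nr_pages` check stays correct here can be spelled out with a hypothetical user-space sketch (invented names, not the kernel code): because the cache is memset to NULL and alloc.count is zero on entry, the bulk API's "total populated" return value collapses to "pages allocated by this call".

```c
#include <assert.h>
#include <string.h>
#include <stddef.h>

/* Bulk-return semantics: returns the total number of populated slots
 * in the array, pre-existing and newly filled alike. */
static int bulk_model(void **array, int nr, int available)
{
	int populated = 0;

	while (populated < nr && array[populated])	/* skip existing */
		populated++;
	while (populated < nr && available-- > 0)
		array[populated++] = (void *)1;		/* fake page */
	return populated;
}
```

Only when the array starts all-NULL does a zero return unambiguously mean "no pages were allocated", which is exactly the guarantee the memset above provides.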

2021-06-29 16:09:00

by Jesper Dangaard Brouer

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value


On 29/06/2021 17.33, Mel Gorman wrote:
> On Tue, Jun 29, 2021 at 09:48:15AM -0400, Chuck Lever wrote:
>> The author of commit b3b64ebd3822 ("mm/page_alloc: do bulk array
>> bounds check after checking populated elements") was possibly
>> confused by the mixture of return values throughout the function.
>>
>> The API contract is clear that the function "Returns the number of
>> pages on the list or array." It does not list zero as a unique
>> return value with a special meaning. Therefore zero is a plausible
>> return value only if @nr_pages is zero or less.
>>
>> Clean up the return logic to make it clear that the returned value
>> is always the total number of pages in the array/list, not the
>> number of pages that were allocated during this call.
>>
>> The only change in behavior with this patch is the value returned
>> if prepare_alloc_pages() fails. To match the API contract, the
>> number of pages currently in the array/list is returned in this
>> case.
>>
>> The call site in __page_pool_alloc_pages_slow() also seems to be
>> confused on this matter. It should be attended to by someone who
>> is familiar with that code.
>>
>> Signed-off-by: Chuck Lever <[email protected]>
> Acked-by: Mel Gorman <[email protected]>
>

Acked-by: Jesper Dangaard Brouer <[email protected]>

2021-06-29 17:02:20

by Chuck Lever

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value



> On Jun 29, 2021, at 12:01 PM, Jesper Dangaard Brouer <[email protected]> wrote:
>
> On 29/06/2021 15.48, Chuck Lever wrote:
>
>> The call site in __page_pool_alloc_pages_slow() also seems to be
>> confused on this matter. It should be attended to by someone who
>> is familiar with that code.
>
> I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array is guaranteed to be empty.
>
> But a fix would look like this:
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index c137ce308c27..1b04538a3da3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>         if (unlikely(pp_order))
>                 return __page_pool_alloc_page_order(pool, gfp);
>
>         /* Unnecessary as alloc cache is empty, but guarantees zero count */
> -       if (unlikely(pool->alloc.count > 0))
> +       if (unlikely(pool->alloc.count > 0))   // ^^^^^^^^^^^^^^^^^^^^^^
>                 return pool->alloc.cache[--pool->alloc.count];
>
>         /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
>         memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> +       /* bulk API ret value also count existing pages, but array is empty */
>         nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
>         if (unlikely(!nr_pages))
>                 return NULL;

Thanks, this check makes sense to me now.


>         /* Pages have been filled into alloc.cache array, but count is zero and
>          * page element have not been (possibly) DMA mapped.
>          */
> -       for (i = 0; i < nr_pages; i++) {
> +       for (i = pool->alloc.count; i < nr_pages; i++) {
>                 page = pool->alloc.cache[i];
>                 if ((pp_flags & PP_FLAG_DMA_MAP) &&
>                     unlikely(!page_pool_dma_map(pool, page))) {
>                         put_page(page);
>

--
Chuck Lever



2021-06-30 07:08:23

by Mel Gorman

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value

On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
> On 29/06/2021 15.48, Chuck Lever wrote:
>
> > The call site in __page_pool_alloc_pages_slow() also seems to be
> > confused on this matter. It should be attended to by someone who
> > is familiar with that code.
>
> I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array
> is guaranteed to be empty.
>
> But a fix would look like this:
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index c137ce308c27..1b04538a3da3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>         if (unlikely(pp_order))
>                 return __page_pool_alloc_page_order(pool, gfp);
>
>         /* Unnecessary as alloc cache is empty, but guarantees zero count */
> -       if (unlikely(pool->alloc.count > 0))
> +       if (unlikely(pool->alloc.count > 0))   // ^^^^^^^^^^^^^^^^^^^^^^
>                 return pool->alloc.cache[--pool->alloc.count];
>
>         /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
>         memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>
> +       /* bulk API ret value also count existing pages, but array is empty */
>         nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
>         if (unlikely(!nr_pages))
>                 return NULL;
>
>         /* Pages have been filled into alloc.cache array, but count is zero and
>          * page element have not been (possibly) DMA mapped.
>          */
> -       for (i = 0; i < nr_pages; i++) {
> +       for (i = pool->alloc.count; i < nr_pages; i++) {

That last part would break as the loop is updating pool->alloc.count.
Just setting pool->alloc.count = nr_pages would break if PP_FLAG_DMA_MAP
was set and page_pool_dma_map() failed. Right?
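
Mel's second concern can be illustrated with a small model (struct and field names invented; this is not the real page_pool code): if the count were preset to nr_pages before mapping, a failed DMA map would leave a freed page's slot inside the counted region.

```c
#include <assert.h>

#define CACHE_SIZE 8

struct pool_model {
	int count;		/* stands in for pool->alloc.count */
	int cache[CACHE_SIZE];	/* stands in for pool->alloc.cache */
};

/* Broken variant: preset the count, then drop pages whose "DMA map"
 * fails.  map_ok[i] says whether mapping slot i succeeds. */
static void preset_count_model(struct pool_model *p, int nr_pages,
			       const int *map_ok)
{
	int i;

	p->count = nr_pages;	/* too early: assumes every map succeeds */
	for (i = 0; i < nr_pages; i++) {
		if (!map_ok[i])
			p->cache[i] = 0;	/* "put_page": dead slot */
	}
}
```

After one failure the count still claims four pages while slot 1 holds a freed page, which is the breakage being described.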

--
Mel Gorman
SUSE Labs

2021-06-30 11:25:10

by Jesper Dangaard Brouer

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value


On 30/06/2021 08.58, Mel Gorman wrote:
> On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
>> On 29/06/2021 15.48, Chuck Lever wrote:
>>
>>> The call site in __page_pool_alloc_pages_slow() also seems to be
>>> confused on this matter. It should be attended to by someone who
>>> is familiar with that code.
>> I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array
>> is guaranteed to be empty.
>>
>> But a fix would look like this:
>>
>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index c137ce308c27..1b04538a3da3 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>>         if (unlikely(pp_order))
>>                 return __page_pool_alloc_page_order(pool, gfp);
>>
>>         /* Unnecessary as alloc cache is empty, but guarantees zero count */
>> -       if (unlikely(pool->alloc.count > 0))
>> +       if (unlikely(pool->alloc.count > 0))   // ^^^^^^^^^^^^^^^^^^^^^^
>>                 return pool->alloc.cache[--pool->alloc.count];
>>
>>         /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
>>         memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
>>
>> +       /* bulk API ret value also count existing pages, but array is empty */
>>         nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
>>         if (unlikely(!nr_pages))
>>                 return NULL;
>>
>>         /* Pages have been filled into alloc.cache array, but count is zero and
>>          * page element have not been (possibly) DMA mapped.
>>          */
>> -       for (i = 0; i < nr_pages; i++) {
>> +       for (i = pool->alloc.count; i < nr_pages; i++) {
> That last part would break as the loop is updating pool->alloc_count.
The last part, "i = pool->alloc.count", is probably a bad idea.
> Just setting pool->alloc_count = nr_pages would break if PP_FLAG_DMA_MAP
> was set and page_pool_dma_map failed. Right?

Yes, this loop is calling page_pool_dma_map(), and if that fails we
don't advance pool->alloc.count.

--Jesper

2021-06-30 12:07:53

by Mel Gorman

Subject: Re: [PATCH v3] mm/page_alloc: Further fix __alloc_pages_bulk() return value

On Wed, Jun 30, 2021 at 01:22:24PM +0200, Jesper Dangaard Brouer wrote:
>
> On 30/06/2021 08.58, Mel Gorman wrote:
> > On Tue, Jun 29, 2021 at 06:01:12PM +0200, Jesper Dangaard Brouer wrote:
> > > On 29/06/2021 15.48, Chuck Lever wrote:
> > >
> > > > The call site in __page_pool_alloc_pages_slow() also seems to be
> > > > confused on this matter. It should be attended to by someone who
> > > > is familiar with that code.
> > > I don't think we need a fix for __page_pool_alloc_pages_slow(), as the array
> > > is guaranteed to be empty.
> > >
> > > But a fix would look like this:
> > >
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index c137ce308c27..1b04538a3da3 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -245,22 +245,23 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
> > >         if (unlikely(pp_order))
> > >                 return __page_pool_alloc_page_order(pool, gfp);
> > >
> > >         /* Unnecessary as alloc cache is empty, but guarantees zero count */
> > > -       if (unlikely(pool->alloc.count > 0))
> > > +       if (unlikely(pool->alloc.count > 0))   // ^^^^^^^^^^^^^^^^^^^^^^
> > >                 return pool->alloc.cache[--pool->alloc.count];
> > >
> > >         /* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
> > >         memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
> > >
> > > +       /* bulk API ret value also count existing pages, but array is empty */
> > >         nr_pages = alloc_pages_bulk_array(gfp, bulk, pool->alloc.cache);
> > >         if (unlikely(!nr_pages))
> > >                 return NULL;
> > >
> > >         /* Pages have been filled into alloc.cache array, but count is zero and
> > >          * page element have not been (possibly) DMA mapped.
> > >          */
> > > -       for (i = 0; i < nr_pages; i++) {
> > > +       for (i = pool->alloc.count; i < nr_pages; i++) {
> > That last part would break as the loop is updating pool->alloc_count.
>
> The last part "i = pool->alloc.count" probably is a bad idea.

It is, because it confuses the context: either alloc.count is guaranteed
to be zero or it's not. With i initialised to zero, it's fairly clear
that a) alloc.count starts at zero and b) the way i and alloc.count work
means that an array element pointing to a freed page is either ignored
or overwritten by a valid page pointer in a later loop iteration.
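
The compaction behaviour described here can be sketched with a user-space model (invented names, not the real page_pool code): with both i and alloc.count starting at zero, a slot whose page fails to map is skipped, and a later valid page is written back over it.

```c
#include <assert.h>

#define CACHE_SIZE 8

struct pool_model {
	int count;		/* stands in for pool->alloc.count */
	int cache[CACHE_SIZE];	/* stands in for pool->alloc.cache */
};

/* map_ok[i] says whether the fake DMA mapping of slot i succeeds. */
static void compact_model(struct pool_model *p, int nr_pages,
			  const int *map_ok)
{
	int i;

	for (i = 0; i < nr_pages; i++) {
		int page = p->cache[i];

		if (!map_ok[i])
			continue;	/* "put_page": slot is dropped */
		/* Keep the page: write it back at the running count. */
		p->cache[p->count++] = page;
	}
}
```

With one failed mapping out of four, the final count is three and the surviving pages are packed densely at the front of the cache.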

--
Mel Gorman
SUSE Labs