2006-11-03 12:17:07

by Larry Woodman

Subject: __alloc_pages() failures reported due to fragmentation

We have seen a couple of __alloc_pages() failures due to fragmentation:
there is plenty of free memory, but no large-order pages are available.
I think the problem is in sock_alloc_send_pskb(): the gfp_mask includes
__GFP_REPEAT, but it is never used or passed to the page allocator.
Shouldn't the gfp_mask be passed to alloc_skb()?

-------------------------------------------------------------------
struct sk_buff *sock_alloc_send_pskb()
{
	unsigned int gfp_mask;
	...

	gfp_mask = sk->sk_allocation;
	if (gfp_mask & __GFP_WAIT)
		gfp_mask |= __GFP_REPEAT;
	...
	/* gfp_mask, now carrying __GFP_REPEAT, is not used here */
	skb = alloc_skb(header_len, sk->sk_allocation);
--------------------------------------------------------------------


--- linux-2.6.18.noarch/net/core/sock.c.orig
+++ linux-2.6.18.noarch/net/core/sock.c
@@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
 			goto failure;

 		if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
-			skb = alloc_skb(header_len, sk->sk_allocation);
+			skb = alloc_skb(header_len, gfp_mask);
 			if (skb) {
 				int npages;
 				int i;




2006-11-03 12:21:57

by Arjan van de Ven

Subject: Re: __alloc_pages() failures reported due to fragmentation


> --- linux-2.6.18.noarch/net/core/sock.c.orig
> +++ linux-2.6.18.noarch/net/core/sock.c
> @@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
>  			goto failure;
>
>  		if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
> -			skb = alloc_skb(header_len, sk->sk_allocation);
> +			skb = alloc_skb(header_len, gfp_mask);
>  			if (skb) {
>  				int npages;
>  				int i;

Hi,

this is not actually right though... sk_allocation may well hold a more
restrictive mask than the one passed in (say "no highmem" or even
GFP_DMA), and you now discard it. It would probably be better to
calculate a set of "transient" flags that you then OR into the
sk_allocation mask at this point...
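
A minimal sketch of the "transient flags" idea described above, as
standalone C. The DEMO_* bit values and both helper names are
placeholders invented for this illustration; they are not copied from a
kernel header or from sock.c.

```c
#include <assert.h>

/* Placeholder bit values, for illustration only (assumed, not from a kernel header). */
#define DEMO_GFP_DMA    0x01u
#define DEMO_GFP_WAIT   0x10u
#define DEMO_GFP_REPEAT 0x400u

/* Hypothetical helper: flags that apply to this one allocation only. */
unsigned int demo_transient_flags(unsigned int sk_allocation)
{
	/* Asking the allocator to retry only makes sense if the caller may sleep. */
	return (sk_allocation & DEMO_GFP_WAIT) ? DEMO_GFP_REPEAT : 0u;
}

/* OR the transient flags into the socket's mask; no existing bit is cleared. */
unsigned int demo_send_mask(unsigned int sk_allocation)
{
	unsigned int mask = sk_allocation | demo_transient_flags(sk_allocation);

	/* Every bit of the socket's own mask must survive. */
	assert((mask & sk_allocation) == sk_allocation);
	return mask;
}
```

Because the extra flag is only ever ORed in, a restriction such as the
DMA bit in the socket's mask is carried through unchanged.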

Greetings,
Arjan van de Ven


2006-11-03 12:48:17

by Larry Woodman

Subject: Re: __alloc_pages() failures reported due to fragmentation

Arjan van de Ven wrote:

>>--- linux-2.6.18.noarch/net/core/sock.c.orig
>>+++ linux-2.6.18.noarch/net/core/sock.c
>>@@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
>> 			goto failure;
>>
>> 		if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
>>-			skb = alloc_skb(header_len, sk->sk_allocation);
>>+			skb = alloc_skb(header_len, gfp_mask);
>> 			if (skb) {
>> 				int npages;
>> 				int i;
>>
>>
>
>Hi,
>
>this is not actually right though... sk_allocation may well hold a more
>restrictive mask than the one passed in (say "no highmem" or even
>GFP_DMA), and you now discard it. It would probably be better to
>calculate a set of "transient" flags that you then OR into the
>sk_allocation mask at this point...
>
>Greetings,
> Arjan van de Ven
>
>
>
Hi Arjan. Right, but this just adds __GFP_REPEAT to the mask so that we
can defragment in __alloc_pages(), and only if __GFP_WAIT was set originally.

Larry


2006-11-03 13:04:57

by David Miller

Subject: Re: __alloc_pages() failures reported due to fragmentation

From: Larry Woodman
Date: Fri, 03 Nov 2006 07:39:44 -0500

> Hi Arjan. Right, but this just adds __GFP_REPEAT to the mask so that we
> can defragment in __alloc_pages(), and only if __GFP_WAIT was set originally.

Indeed, quoting that small snippet of the patch was deceptive :-)

Arjan, gfp_mask is set to sk->sk_allocation, and we just add a bit
into it conditionally.
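
To make that point concrete, here is a small standalone check, offered
as a sketch rather than kernel code. It copies the mask and conditionally
adds one bit, the same shape as the snippet in the first mail, then walks
a range of input masks to confirm that no input bit is ever lost. The
SKETCH_* values and function names are assumptions made up for this
example.

```c
#include <assert.h>

/* Placeholder bit values, for illustration only (assumed, not from a kernel header). */
#define SKETCH_GFP_WAIT   0x10u
#define SKETCH_GFP_REPEAT 0x400u

/* Same shape as the snippet under discussion: copy the mask, then add one bit. */
unsigned int sketch_patched_mask(unsigned int sk_allocation)
{
	unsigned int gfp_mask = sk_allocation;

	if (gfp_mask & SKETCH_GFP_WAIT)
		gfp_mask |= SKETCH_GFP_REPEAT;
	return gfp_mask;
}

/*
 * Count the input masks in [0, limit) for which the derived mask either
 * drops an input bit, adds anything other than the REPEAT bit, or adds
 * REPEAT although WAIT was not set.
 */
unsigned int sketch_count_violations(unsigned int limit)
{
	unsigned int m, bad = 0;

	for (m = 0; m < limit; m++) {
		unsigned int out = sketch_patched_mask(m);
		unsigned int added = out & ~m;

		if ((out & m) != m)
			bad++;		/* an input bit was discarded */
		else if (added & ~SKETCH_GFP_REPEAT)
			bad++;		/* an unexpected bit was added */
		else if (added && !(m & SKETCH_GFP_WAIT))
			bad++;		/* REPEAT added without WAIT */
	}
	return bad;
}

/* Convenience wrapper: abort if any 11-bit input mask violates the rules. */
void sketch_self_check(void)
{
	assert(sketch_count_violations(0x800u) == 0u);
}
```

Calling sketch_self_check() covers every combination of the low eleven
bits; a restrictive bit in the input can never be cleared because the
function only ever ORs into a copy of the original mask.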