We have seen a couple of __alloc_pages() failures due to fragmentation:
there is plenty of free memory but no large-order pages available.  I think
the problem is in sock_alloc_send_pskb(): the gfp_mask includes __GFP_REPEAT
but it is never used/passed to the page allocator.  Shouldn't the gfp_mask
be passed to alloc_skb()?
-------------------------------------------------------------------
struct sk_buff *sock_alloc_send_pskb()
{
	gfp_t gfp_mask;
	...
	gfp_mask = sk->sk_allocation;
	if (gfp_mask & __GFP_WAIT)
		gfp_mask |= __GFP_REPEAT;
	...
	skb = alloc_skb(header_len, sk->sk_allocation);
--------------------------------------------------------------------
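For concreteness, here is a sketch (not verbatim kernel source) of what the
mask computation does when the socket allocates with GFP_KERNEL -- an
assumption about sk->sk_allocation: the __GFP_REPEAT bit is computed but then
dropped, because the allocation call still passes sk->sk_allocation:

	/* Sketch only, assuming sk->sk_allocation == GFP_KERNEL (which has __GFP_WAIT set). */
	gfp_t gfp_mask = sk->sk_allocation;	/* e.g. GFP_KERNEL			*/
	if (gfp_mask & __GFP_WAIT)		/* true for GFP_KERNEL			*/
		gfp_mask |= __GFP_REPEAT;	/* ask __alloc_pages() to try harder	*/

	/* The flag computed above is never used: sk->sk_allocation is passed instead,
	 * so __alloc_pages() never sees __GFP_REPEAT for the high-order allocation. */
	skb = alloc_skb(header_len, sk->sk_allocation);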
--- linux-2.6.18.noarch/net/core/sock.c.orig
+++ linux-2.6.18.noarch/net/core/sock.c
@@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
goto failure;
if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
- skb = alloc_skb(header_len, sk->sk_allocation);
+ skb = alloc_skb(header_len, gfp_mask);
if (skb) {
int npages;
int i;
> --- linux-2.6.18.noarch/net/core/sock.c.orig
> +++ linux-2.6.18.noarch/net/core/sock.c
> @@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
> goto failure;
>
> if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
> - skb = alloc_skb(header_len, sk->sk_allocation);
> + skb = alloc_skb(header_len, gfp_mask);
> if (skb) {
> int npages;
> int i;
Hi,
this is not actually right though... it's quite possible for sk_allocation to
have a restricting mask compared to the one passed in (say "no highmem"
or even GFP_DMA), and you now discard this... it would probably be better to
calculate a set of "transient" flags that you then OR into the
sk_allocation mask at this point...
Greetings,
Arjan van de Ven
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
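To illustrate the "transient flags" approach Arjan describes (a sketch only,
not a patch posted in this thread): compute the extra bits separately and OR
them into sk->sk_allocation at the call site, so any restrictions already in
sk_allocation (GFP_DMA, no highmem, etc.) are preserved:

	/* Hypothetical sketch of the "transient flags" idea, not actual kernel code. */
	gfp_t transient = 0;

	if (sk->sk_allocation & __GFP_WAIT)
		transient |= __GFP_REPEAT;	/* only retry harder if we may sleep */

	/* sk_allocation keeps its zone modifiers; transient only adds bits on top. */
	skb = alloc_skb(header_len, sk->sk_allocation | transient);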
Arjan van de Ven wrote:
>>--- linux-2.6.18.noarch/net/core/sock.c.orig
>>+++ linux-2.6.18.noarch/net/core/sock.c
>>@@ -1154,7 +1154,7 @@ static struct sk_buff *sock_alloc_send_p
>> goto failure;
>>
>> if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
>>- skb = alloc_skb(header_len, sk->sk_allocation);
>>+ skb = alloc_skb(header_len, gfp_mask);
>> if (skb) {
>> int npages;
>> int i;
>>
>>
>
>Hi,
>
>this is not actually right though... it's quite possible for sk_allocation to
>have a restricting mask compared to the one passed in (say "no highmem"
>or even GFP_DMA), and you now discard this... it would probably be better to
>calculate a set of "transient" flags that you then OR into the
>sk_allocation mask at this point...
>
>Greetings,
> Arjan van de Ven
>
>
>
Hi Arjan. Right, but this just includes __GFP_REPEAT in the mask so we can
defrag in __alloc_pages(), and only if __GFP_WAIT was passed in originally.
Larry
From: Larry Woodman <[email protected]>
Date: Fri, 03 Nov 2006 07:39:44 -0500
> Hi Arjan. Right, but this just includes __GFP_REPEAT in the mask so we can
> defrag in __alloc_pages(), and only if __GFP_WAIT was passed in originally.
Indeed, quoting that small snippet of the patch was deceptive :-)
Arjan, gfp_mask is set to sk->sk_allocation, and we just add a bit
into it conditionally.
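For clarity, the flow in the patched function looks roughly like this (a
sketch assembled from the snippets above, not the verbatim 2.6.18 source):
gfp_mask starts out as sk->sk_allocation, so any zone restrictions in the
socket's mask are kept, and __GFP_REPEAT is only added on top when the caller
may sleep:

	static struct sk_buff *sock_alloc_send_pskb(...)
	{
		struct sk_buff *skb;
		gfp_t gfp_mask;

		gfp_mask = sk->sk_allocation;		/* start from the socket's own mask   */
		if (gfp_mask & __GFP_WAIT)
			gfp_mask |= __GFP_REPEAT;	/* add retry bit for sleeping allocs  */
		...
		if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
			/* with the patch, the combined mask reaches the page allocator */
			skb = alloc_skb(header_len, gfp_mask);
			...
		}
		...
	}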