2005-11-18 00:42:00

by Yan Zheng

Subject: [DEBUG INFO]IPv6: sleeping function called from invalid context.

I get the following message when switching to single-user mode. The kernel version is 2.6.15-rc1-git5.

Regards

Nov 18 08:26:23 localhost kernel: Debug: sleeping function called from invalid context at mm/slab.c:2472
Nov 18 08:26:23 localhost kernel: in_atomic():1, irqs_disabled():0
Nov 18 08:26:23 localhost kernel: [<c0149d5a>] kmem_cache_alloc+0x5a/0x70
Nov 18 08:26:23 localhost kernel: [<e0f47336>] inet6_dump_fib+0xb6/0x110 [ipv6]
Nov 18 08:26:23 localhost kernel: [<c02e129c>] netlink_dump+0x4c/0x1e0
Nov 18 08:26:23 localhost kernel: [<c02e150a>] netlink_dump_start+0xda/0x170
Nov 18 08:26:23 localhost kernel: [<c02d2d05>] rtnetlink_rcv_msg+0x1d5/0x250
Nov 18 08:26:23 localhost kernel: [<e0f47280>] inet6_dump_fib+0x0/0x110 [ipv6]
Nov 18 08:26:23 localhost kernel: [<c02d2b30>] rtnetlink_rcv_msg+0x0/0x250
Nov 18 08:26:23 localhost kernel: [<c02e17ed>] netlink_rcv_skb+0x4d/0x90
Nov 18 08:26:23 localhost kernel: [<c02d2b30>] rtnetlink_rcv_msg+0x0/0x250
Nov 18 08:26:23 localhost kernel: [<c02e1860>] netlink_run_queue+0x30/0xc0
Nov 18 08:26:23 localhost kernel: [<c02d2b03>] rtnetlink_rcv+0x23/0x50
Nov 18 08:26:23 localhost kernel: [<c02e1092>] netlink_data_ready+0x12/0x60
Nov 18 08:26:23 localhost kernel: [<c0335414>] _spin_unlock_irqrestore+0x14/0x30
Nov 18 08:26:23 localhost kernel: [<c02e0284>] netlink_sendskb+0x24/0x50
Nov 18 08:26:23 localhost kernel: [<c02e0d3c>] netlink_sendmsg+0x29c/0x350
Nov 18 08:26:23 localhost kernel: [<c02be551>] sock_sendmsg+0x111/0x150
Nov 18 08:26:23 localhost kernel: [<c01c56f0>] avc_lookup+0xc0/0x120
Nov 18 08:26:23 localhost kernel: [<c0131f10>] autoremove_wake_function+0x0/0x50
Nov 18 08:26:23 localhost kernel: [<c02bdfff>] move_addr_to_user+0x5f/0x70
Nov 18 08:26:23 localhost kernel: [<c02c0628>] sys_recvmsg+0x1b8/0x230
Nov 18 08:26:23 localhost kernel: [<c013ef94>] audit_sockaddr+0x54/0xc0
Nov 18 08:26:23 localhost kernel: [<c02bfded>] sys_sendto+0x10d/0x150
Nov 18 08:26:23 localhost kernel: [<c0141b6d>] filemap_nopage+0x30d/0x3e0
Nov 18 08:26:23 localhost kernel: [<c0145424>] __alloc_pages+0x64/0x310
Nov 18 08:26:23 localhost kernel: [<c013db8e>] audit_filter_syscall+0x4e/0x130
Nov 18 08:26:23 localhost kernel: [<c013db8e>] audit_filter_syscall+0x4e/0x130
Nov 18 08:26:23 localhost kernel: [<c02c0865>] sys_socketcall+0x1c5/0x2a0
Nov 18 08:26:23 localhost kernel: [<c0103261>] syscall_call+0x7/0xb
Nov 18 08:26:23 localhost kernel: Removing netfilter NETLINK layer.


2005-11-18 00:46:51

by YOSHIFUJI Hideaki

Subject: Re: [DEBUG INFO]IPv6: sleeping function called from invalid context.

In article <[email protected]> (at Fri, 18 Nov 2005 08:44:27 +0800), Yan Zheng <[email protected]> says:

> I get the following message when switching to single-user mode. The kernel version is 2.6.15-rc1-git5.
:
> Nov 18 08:26:23 localhost kernel: Debug: sleeping function called from invalid context at mm/slab.c:2472
> Nov 18 08:26:23 localhost kernel: in_atomic():1, irqs_disabled():0
> Nov 18 08:26:23 localhost kernel: [<c0149d5a>] kmem_cache_alloc+0x5a/0x70
> Nov 18 08:26:23 localhost kernel: [<e0f47336>] inet6_dump_fib+0xb6/0x110 [ipv6]

I remember someone replaced GFP_ATOMIC with GFP_KERNEL...

--yoshfuji

2005-11-18 12:35:37

by Thomas Graf

Subject: Re: [DEBUG INFO]IPv6: sleeping function called from invalid context.

* YOSHIFUJI Hideaki / 吉藤英明 <[email protected]> 2005-11-18 09:47
> In article <[email protected]> (at Fri, 18 Nov 2005 08:44:27 +0800), Yan Zheng <[email protected]> says:
>
> > I get the following message when switching to single-user mode. The kernel version is 2.6.15-rc1-git5.
> :
> > Nov 18 08:26:23 localhost kernel: Debug: sleeping function called from invalid context at mm/slab.c:2472
> > Nov 18 08:26:23 localhost kernel: in_atomic():1, irqs_disabled():0
> > Nov 18 08:26:23 localhost kernel: [<c0149d5a>] kmem_cache_alloc+0x5a/0x70
> > Nov 18 08:26:23 localhost kernel: [<e0f47336>] inet6_dump_fib+0xb6/0x110 [ipv6]
>
> I remember someone replaced GFP_ATOMIC with GFP_KERNEL...

I did. I think it was right: why would an allocation be necessary on
the second call to inet6_dump_fib()? The walker allocated in process
context on the first call should be reused from cb->args[0].
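
The reuse described above can be illustrated with a small userspace
model. This is only a sketch, not the kernel's inet6_dump_fib():
struct dump_cb, struct walker, dump_some(), dump_done() and
alloc_count are hypothetical names invented for the example, and
calloc() stands in for the slab allocation. The state pointer is
stored in args[0] on the first call and picked up again on every
continuation, so only the first call allocates.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the netlink callback state. */
struct dump_cb {
    intptr_t args[4];      /* scratch space that persists across calls */
};

/* Hypothetical walker: remembers where the next call should resume. */
struct walker {
    int next_entry;
};

static int alloc_count;    /* counts allocations, for illustration only */

/*
 * Emit up to `budget` of `total` entries. Returns the number emitted,
 * or -1 if the walker could not be allocated.
 */
static int dump_some(struct dump_cb *cb, int total, int budget)
{
    struct walker *w = (struct walker *)cb->args[0];
    int emitted = 0;

    if (w == NULL) {
        /* First call only: allocate the walker and stash it. */
        w = calloc(1, sizeof(*w));
        if (w == NULL)
            return -1;
        alloc_count++;
        cb->args[0] = (intptr_t)w;
    }

    /* Continuations resume from the saved position. */
    while (emitted < budget && w->next_entry < total) {
        w->next_entry++;
        emitted++;
    }
    return emitted;
}

/* Release the walker once the dump has finished. */
static void dump_done(struct dump_cb *cb)
{
    free((void *)cb->args[0]);
    cb->args[0] = 0;
}
```

Calling dump_some() repeatedly on the same struct dump_cb walks the
entries in chunks while alloc_count stays at 1, which is the behaviour
the paragraph above expects from the second call onwards.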

2005-11-19 11:48:48

by Herbert Xu

Subject: Re: [DEBUG INFO]IPv6: sleeping function called from invalid context.

Thomas Graf <[email protected]> wrote:
>
> I did. I think it was right, why would an allocation be necessary on
> the second call to inet6_dump_fib()? The walker allocated in process
> context on the first call should be reused from cb->args[0].

Continued dumps are always called under spin lock (see netlink_dump).
So we need to use GFP_ATOMIC in dumpers.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
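
The rule stated in the message above (no sleeping allocation while a
spin lock is held) can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions, not kernel code:
MODE_ATOMIC and MODE_MAY_SLEEP stand in for GFP_ATOMIC and GFP_KERNEL,
and model_spin_lock(), model_alloc(), dump_under_lock() and the
counters are hypothetical names. The violation counter plays the role
of the "sleeping function called from invalid context" debug check.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for GFP_ATOMIC and GFP_KERNEL. */
enum alloc_mode { MODE_ATOMIC, MODE_MAY_SLEEP };

static int atomic_depth;   /* > 0 while a modelled spin lock is held */
static int violations;     /* times a may-sleep call ran in atomic context */

static void model_spin_lock(void)
{
    atomic_depth++;
}

static void model_spin_unlock(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Allocate memory; flag the call if it may sleep while a lock is held. */
static void *model_alloc(size_t size, enum alloc_mode mode)
{
    if (mode == MODE_MAY_SLEEP && atomic_depth > 0)
        violations++;      /* the kernel would print the debug warning here */
    return malloc(size);
}

/*
 * Model of a dump callback that allocates while the lock is held.
 * Returns how many violations this call triggered.
 */
static int dump_under_lock(enum alloc_mode mode)
{
    int before = violations;
    void *walker;

    model_spin_lock();
    walker = model_alloc(64, mode);
    model_spin_unlock();

    free(walker);
    return violations - before;
}
```

In this model dump_under_lock(MODE_MAY_SLEEP) reports one violation
and dump_under_lock(MODE_ATOMIC) reports none, which mirrors why the
reported trace appears with GFP_KERNEL and would not with GFP_ATOMIC.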

2005-11-19 21:03:52

by Thomas Graf

Subject: Re: [DEBUG INFO]IPv6: sleeping function called from invalid context.

* Herbert Xu <[email protected]> 2005-11-19 22:48
> Thomas Graf <[email protected]> wrote:
> >
> > I did. I think it was right, why would an allocation be necessary on
> > the second call to inet6_dump_fib()? The walker allocated in process
> > context on the first call should be reused from cb->args[0].
>
> Continued dumps are always called under spin lock (see netlink_dump).
> So we need to use GFP_ATOMIC in dumpers.

The continued dumps wouldn't be the problem; the walker is allocated
on the initial dump call. The change was a mistake though: the
nlk->cb_lock spin lock is always held for cb->dump(), even though it
should only be required during the nlk->cb != NULL check.
netlink_dump_start() guarantees that only one dumper per socket runs
at a time.
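
The locking scheme argued for above (hold the lock only for the
nlk->cb != NULL check, then run the dump without it) might look like
this in a single-threaded userspace model. It is a sketch of the
proposal only, not of what netlink_dump_start() does: struct
model_sock, struct model_cb and the model_* functions are
hypothetical, the lock is a plain flag, and -EBUSY is assumed as the
"dump already in progress" result. It does not address any other
reason the lock might be needed during the dump.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_sock;

/* Hypothetical callback descriptor. */
struct model_cb {
    int (*dump)(struct model_sock *sk);
};

/* Hypothetical socket: a flag models cb_lock, cb marks an active dump. */
struct model_sock {
    int lock_held;
    struct model_cb *cb;
};

static int lock_seen_in_dump = -1;  /* lock state observed by the dumper */

/* Example dumper: records whether the lock was held when it ran. */
static int sample_dump(struct model_sock *sk)
{
    lock_seen_in_dump = sk->lock_held;
    return 0;
}

/* Start a dump; only the "is a dump already active?" check is locked. */
static int model_dump_start(struct model_sock *sk, struct model_cb *cb)
{
    sk->lock_held = 1;
    if (sk->cb != NULL) {
        sk->lock_held = 0;
        return -EBUSY;              /* one dumper per socket at a time */
    }
    sk->cb = cb;
    sk->lock_held = 0;

    return cb->dump(sk);            /* runs without the lock in this model */
}

/* Finish the dump and allow the next one to start. */
static void model_dump_done(struct model_sock *sk)
{
    sk->lock_held = 1;
    sk->cb = NULL;
    sk->lock_held = 0;
}
```

Because the slot in sk->cb is claimed under the lock, a second
model_dump_start() on the same socket is refused until
model_dump_done() clears it, while the dumper itself observes the lock
as released and would be free to use a sleeping allocation.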

2005-11-19 22:39:15

by Herbert Xu

Subject: Re: [DEBUG INFO]IPv6: sleeping function called from invalid context.

On Sat, Nov 19, 2005 at 10:04:11PM +0100, Thomas Graf wrote:
>
> The continued dumps wouldn't be the problem, the walker is allocated
> on the initial dump call. It was a mistake though, nlk->cb_lock spin
> lock is always held for cb->dump() even though it should only be
> required during the nlk->cb != NULL check. netlink_dump_start()
> guarantees to only allow one dumper per socket at a time.

You're certainly right that the initial dump is what's causing the
problem.

I think the spin lock is still required though, because we also need
to guard against netlink_release, which can occur at any time since
the packet processing could be occurring in a different thread from
the one that did the sendmsg.

Cheers,