2007-08-20 16:40:06

by Thomas Jarosch

[permalink] [raw]
Subject: Kernel oops during netfilter memory allocation

Hey there,

I'm currently debugging a kernel oops with kernel 2.6.21.7 that occurs
from time to time with our netfilter accounting module ipt_ACCOUNT
(http://www.intra2net.com/de/produkte/opensource/ipt_account/).

What basically happens is that the module allocates memory
inside an interrupt handler like many other netfilter modules do.
Except that it does it very often if you have a large, busy network.

To me it looks like I found a bug in the kernel memory handler.
I've written a minimalistic netfilter module ipt_CRASH that
just allocates and frees memory to demonstrate the problem.

Here's a screenshot of the oops:
http://img231.imageshack.us/img231/9064/kerneloopsiptcrashzb1.jpg

Another shot of it with a "tainted" kernel:
http://img529.imageshack.us/img529/6681/kerneloopsey1.jpg

I would have included the backtrace as text, but unfortunately
the USB serial console dies before it outputs anything useful.

You need a fast machine with lots of RAM to reproduce it, I used a 3 Ghz
machine with 3 GB of RAM. ipt_CRASH allocates and frees memory for every
packet, a packet generator like "sendip" might help:
http://www.earth.li/projectpurple/progs/sendip.html

The machine dies after 15-30 minutes or as soon as I try to reboot it.

Please CC: answers, I'm only on netfilter-devel.

Thanks in advance,
Thomas


Attachments:
(No filename) (1.30 kB)
crashit.sh (172.00 B)
libipt_CRASH.c (2.03 kB)
ipt_CRASH.c (3.03 kB)
ipt_CRASH.h (842.00 B)
Download all attachments

2007-08-20 19:31:17

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: Kernel oops during netfilter memory allocation

On Mon, Aug 20, 2007 at 06:08:08PM +0200, Thomas Jarosch wrote:
> I'm currently debugging a kernel oops with kernel 2.6.21.7 that occurs
> from time to time with our netfilter accounting module ipt_ACCOUNT
> (http://www.intra2net.com/de/produkte/opensource/ipt_account/).

kernel BUG at arch/i386/mm/highmem.c:38

> static unsigned int ipt_crash_target(struct sk_buff **pskb,
> const struct net_device *in,
> const struct net_device *out,
> unsigned int hooknum,
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,17)
> const struct xt_target *target,
> #endif
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,19)
> const void *targinfo)
> #else
> const void *targinfo,
> void *userinfo)
> #endif
> {
> char *data;
>
> spin_lock_bh(&ipt_crash_lock);
>
> if ((data = (char *)get_zeroed_page(GFP_ATOMIC)) == NULL) {

Try this.

commit b8c1c5da1520977cb55a358f20fc09567d40cad9
tree c762e6ad77297beed0978337ce2f5b0c50add739
parent 01e457cfcd5b6b6f18d0bb8cec0c5d43df56557e
author Andrew Morton <[email protected]> 1185303760 -0700
committer Linus Torvalds <[email protected]> 1185305099 -0700

slab: correctly handle __GFP_ZERO

Use the correct local variable when calling into the page allocator. Local
`flags' can have __GFP_ZERO set, which causes us to pass __GFP_ZERO into the
page allocator, possibly from illegal contexts. The page allocator will later
do prep_zero_page()->kmap_atomic(..., KM_USER0) from irq contexts and will
then go BUG.

Cc: Mike Galbraith <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
diff --git a/mm/slab.c b/mm/slab.c
index bde271c..a684778 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2776,7 +2776,7 @@ static int cache_grow(struct kmem_cache *cachep,
* 'nodeid'.
*/
if (!objp)
- objp = kmem_getpages(cachep, flags, nodeid);
+ objp = kmem_getpages(cachep, local_flags, nodeid);
if (!objp)
goto failed;

2007-08-21 09:55:20

by Thomas Jarosch

[permalink] [raw]
Subject: Re: Kernel oops during netfilter memory allocation

On Monday, 20. August 2007, Alexey Dobriyan wrote:
> kernel BUG at arch/i386/mm/highmem.c:38
>
> Try this.
>
> commit b8c1c5da1520977cb55a358f20fc09567d40cad9
> tree c762e6ad77297beed0978337ce2f5b0c50add739
> parent 01e457cfcd5b6b6f18d0bb8cec0c5d43df56557e
> author Andrew Morton <[email protected]> 1185303760 -0700
> committer Linus Torvalds <[email protected]> 1185305099
> -0700
>
> slab: correctly handle __GFP_ZERO
>
> Use the correct local variable when calling into the page allocator. Local
> `flags' can have __GFP_ZERO set, which causes us to pass __GFP_ZERO into
> the page allocator, possibly from illegal contexts. The page allocator
> will later do prep_zero_page()->kmap_atomic(..., KM_USER0) from irq
> contexts and will then go BUG.

Thanks for the patch. Unfortunately it didn't cure the problem.
After some searching I found one of those rare serial port brackets
in the attic, so I am now able to get a dump from the serial console.

Here's the output with the patch applied:

------------[ cut here ]------------
kernel BUG at arch/i386/mm/highmem.c:38!
invalid opcode: 0000 [#1]
Modules linked in: ipt_CRASH(U) deflate zlib_deflate twofish twofish_common
serpent blowfish des cbc ecb blkcipher aes xcbc sha256 crypto_null xfrm_user
xfrm4_tunnel tunnel4 ipcomp esp4 ah4 af_key dummy tg3 e1000 iptable_mangle
ipt_iprange xt_CONNMARK ipt_REDIRECT ipt_MASQUERADE iptable_nat xt_conntrack
xt_connmark ipt_LOG xt_limit xt_TCPMSS ipt_REJECT ipt_recent
nf_conntrack_ipv4 xt_state ipt_ACCOUNT xt_tcpudp xt_condition xt_policy
xt_multiport iptable_filter ip_tables x_tables nf_nat_tftp nf_nat_sip
nf_nat_irc nf_nat_pptp nf_nat_proto_gre nf_nat_ftp nf_nat nf_conntrack_tftp
nf_conntrack_socks nf_conntrack_sip nf_conntrack_irc nf_conntrack_pptp
nf_conntrack_proto_gre nf_conntrack_ftp nf_conntrack nfnetlink ext3 jbd
mbcache reiserfs sr_mod cdrom sg ata_piix libata generic
CPU: 0
EIP: 0060:[<c0114425>] Not tainted VLI
EFLAGS: 00010286 (2.6.21-1.i2n #4)
EIP is at kmap_atomic+0x24/0x88
eax: 0000000c ebx: c0002ee4 ecx: c16fa5e0 edx: 00000003
esi: 00000003 edi: 00000001 ebp: c16fa5e0 esp: f7723a98
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process pop3.monitor (pid: 3015, ti=f7722000 task=dfdc3530 task.ti=f7722000)
Stack: c16fa5e0 00000000 c013c01b 00000001 00000044 00000044 00000001 00000000
00028220 c036dea4 c036dc04 00000000 00000001 f7723b78 00000001 00000286
00008220 00008220 c036dea0 dfdc3530 c013c0fa 00000044 00000002 f79da4b4
Call Trace:
[<c013c01b>] get_page_from_freelist+0x1e6/0x273
[<c013c0fa>] __alloc_pages+0x52/0x286
[<f8890194>] nf_conntrack_tcp_update+0x1068/0x10e8 [nf_conntrack]
[<c013c394>] get_zeroed_page+0x2c/0x3a
[<f89f8000>] ipt_crash_target+0x0/0x48 [ipt_CRASH]
[<f89f8012>] ipt_crash_target+0x12/0x48 [ipt_CRASH]
[<f893b360>] ipt_do_table+0x27b/0x2d3 [ip_tables]
[<c029085c>] nf_iterate+0x38/0x6a
[<c0290986>] nf_hook_slow+0x4d/0xb5
[<c0295336>] ip_local_deliver_finish+0x0/0x166
[<c0295983>] ip_local_deliver+0x6f/0x1e0
[<c0295336>] ip_local_deliver_finish+0x0/0x166
[<c02958e7>] ip_rcv+0x3ad/0x3da
[<c0295140>] ip_rcv_finish+0x0/0x1f6
[<c027f42c>] netif_receive_skb+0x217/0x260
[<c027ba46>] __alloc_skb+0x49/0xf7
[<f89b4f8c>] e1000_read_pci_cfg+0x5eb/0x1903 [e1000]
[<f89b4c42>] e1000_read_pci_cfg+0x2a1/0x1903 [e1000]
[<f89b3f8f>] e1000_reinit_locked+0x1aba/0x24cc [e1000]
[<c013c0fa>] __alloc_pages+0x52/0x286
[<f89f8000>] ipt_crash_target+0x0/0x48 [ipt_CRASH]
[<c0280a80>] net_rx_action+0x52/0x10e
[<c011ae94>] __do_softirq+0x35/0x75
[<c011aef6>] do_softirq+0x22/0x26
[<c01061ef>] do_IRQ+0x5c/0x71
[<c010460f>] common_interrupt+0x23/0x28
[<c01a31be>] __copy_to_user_ll+0xcb/0xd6
[<c01384f4>] file_read_actor+0x6b/0xca
[<c0138c48>] do_generic_mapping_read+0x172/0x40a
[<c013a90a>] generic_file_aio_read+0x150/0x182
[<c0138489>] file_read_actor+0x0/0xca
[<c0157b18>] do_path_lookup+0x14a/0x19a
[<c015156b>] get_empty_filp+0x4f/0xcc
[<c0150613>] do_sync_read+0xc7/0x10a
[<c0125574>] autoremove_wake_function+0x0/0x35
[<c0107979>] sys_mmap2+0x6f/0xb7
[<c015054c>] do_sync_read+0x0/0x10a
[<c0150d76>] vfs_read+0x88/0x10a
[<c0151098>] sys_read+0x41/0x67
[<c0103c66>] syscall_call+0x7/0xb
=======================
Code: c8 e9 f9 c4 02 00 c3 56 89 c1 53 89 d6 89 e0 25 00 e0 ff ff ff 40 14 8b
1d b4 f7 3e c0 8d 04 95 00 00 00 00 29 c3 83 3b 00 74 04 <0f> 0b eb fe 8b 01
c1 e8 1e 69 c0 44 01 00 00 05 c0 da 36 c0 8b
EIP: [<c0114425>] kmap_atomic+0x24/0x88 SS:ESP 0068:f7723a98
Kernel panic - not syncing: Fatal exception in interrupt
------------------------------------


I've replaced the Intel e1000 card with a bulk Realtek 8139 card just to make
sure it's not related to the e1000 driver. Here's the backtrace:

------------[ cut here ]------------
kernel BUG at arch/i386/mm/highmem.c:38!
invalid opcode: 0000 [#1]
Modules linked in: ipt_CRASH(U) iptable_filter ip_tables x_tables 8139too mii
deflate zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkc
CPU: 0
EIP: 0060:[<c0114425>] Not tainted VLI
EFLAGS: 00010286 (2.6.21-1.i2n #4)
EIP is at kmap_atomic+0x24/0x88
eax: 0000000c ebx: c0002ee4 ecx: c167b2c0 edx: 00000003
esi: 00000003 edi: 00000001 ebp: c167b2c0 esp: f7a73c74
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process imap.monitor (pid: 3181, ti=f7a72000 task=dfc8ea30 task.ti=f7a72000)
Stack: c167b2c0 00000000 c013c01b 00000001 00000044 00000002 00000001 00000000
00028220 c036dea4 c036dc04 00000000 00000001 00000000 00000001 dfdfde40
00008220 00008220 c036dea0 dfc8ea30 c013c0fa 00000044 c013c0fa 00000044
Call Trace:
[<c013c01b>] get_page_from_freelist+0x1e6/0x273
[<c013c0fa>] __alloc_pages+0x52/0x286
[<c013c0fa>] __alloc_pages+0x52/0x286
[<c013a66b>] filemap_nopage+0x156/0x2a5
[<c013c394>] get_zeroed_page+0x2c/0x3a
[<f8933000>] ipt_crash_target+0x0/0x48 [ipt_CRASH]
[<f8933012>] ipt_crash_target+0x12/0x48 [ipt_CRASH]
[<f893b360>] ipt_do_table+0x27b/0x2d3 [ip_tables]
[<c029085c>] nf_iterate+0x38/0x6a
[<c0290986>] nf_hook_slow+0x4d/0xb5
[<c0295336>] ip_local_deliver_finish+0x0/0x166
[<c0295983>] ip_local_deliver+0x6f/0x1e0
[<c0295336>] ip_local_deliver_finish+0x0/0x166
[<c02958e7>] ip_rcv+0x3ad/0x3da
[<c027f42c>] netif_receive_skb+0x217/0x260
[<c027ba46>] __alloc_skb+0x49/0xf7
[<f8a4540c>] init_module+0xc140c/0xc22b0 [8139too]
[<c0280a80>] net_rx_action+0x52/0x10e
[<c011ae94>] __do_softirq+0x35/0x75
[<c011aef6>] do_softirq+0x22/0x26
[<c01061ef>] do_IRQ+0x5c/0x71
[<c010460f>] common_interrupt+0x23/0x28
[<c0142282>] __handle_mm_fault+0x1aa/0x7a6
[<c0113922>] do_page_fault+0x20d/0x4d3
[<c0113715>] do_page_fault+0x0/0x4d3
[<c02ca89c>] error_code+0x74/0x7c
=======================
Code: c8 e9 f9 c4 02 00 c3 56 89 c1 53 89 d6 89 e0 25 00 e0 ff ff ff 40 14 8b
1d b4 f7 3e c0 8d 04 95 00 00 00 00 29 c3 83 3b 00 74 04 <0f> 0b eb fe 8b 01
EIP: [<c0114425>] kmap_atomic+0x24/0x88 SS:ESP 0068:f7a73c74
Kernel panic - not syncing: Fatal exception in interrupt
------------------------------------

I can't trigger the bug by force if I boot the system with mem=512m.

Anything I could try?

Thomas

2007-08-22 15:09:26

by Thomas Jarosch

[permalink] [raw]
Subject: Re: Kernel oops during netfilter memory allocation

Hello,

> > kernel BUG at arch/i386/mm/highmem.c:38
> >
> > Try this.

I just tried kernel 2.6.22.4 and 2.6.23-rc3. Using 2.6.23-rc3 vanilla,
the box survives only 3 seconds with ipt_CRASH. Here's the backtrace,
it's only slightly different:

------------[ cut here ]------------
kernel BUG at arch/i386/mm/highmem.c:38!
invalid opcode: 0000 [#1]
Modules linked in: ipt_CRASH iptable_filter ip_tables x_tables deflate
zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkcipher
aes xcbcd
CPU: 0
EIP: 0060:[<c011401f>] Not tainted VLI
EFLAGS: 00010286 (2.6.23-rc3 #1)
EIP is at kmap_atomic_prot+0x24/0x83
eax: 0000000c ebx: c16eba60 ecx: c0002edc edx: 00000003
esi: 00000163 edi: 00000001 ebp: c16eba60 esp: f75f1b9c
ds: 007b es: 007b fs: 0000 gs: 0000 ss: 0068
Process info_iponline (pid: 3469, ti=f75f0000 task=f7269ab0 task.ti=f75f0000)
Stack: c16eba60 00000000 c013c695 00000001 00000044 40000000 00000001 00000000
00028020 c0327a3c c0327648 00000000 00000001 00000000 00000001 00000000
00008020 00008020 00000000 c0327a38 c013c778 00000044 63d210ac 00000000
Call Trace:
[<c013c695>] get_page_from_freelist+0x1ed/0x284
[<c013c778>] __alloc_pages+0x4c/0x282
[<c013ca31>] get_zeroed_page+0x3c/0x4a
[<f894e000>] ipt_crash_target+0x0/0x48 [ipt_CRASH]
[<f894e012>] ipt_crash_target+0x12/0x48 [ipt_CRASH]
[<f8958331>] ipt_do_table+0x26d/0x2c8 [ip_tables]
[<c0258484>] nf_iterate+0x38/0x6a
[<c02585ed>] nf_hook_slow+0x4d/0xb5
[<c025d1b3>] ip_local_deliver_finish+0x0/0x16b
[<c025d85c>] ip_local_deliver+0x72/0x1ea
[<c025d1b3>] ip_local_deliver_finish+0x0/0x16b
[<c025d7bd>] ip_rcv+0x3fb/0x428
[<c01287c4>] getnstimeofday+0x2b/0xaf
[<c0127776>] ktime_get_real+0xf/0x2b
[<c024540c>] netif_receive_skb+0x219/0x269
[<f8998fa0>] e1000_clean_rx_irq+0x35a/0x42a [e1000]
[<f8998c46>] e1000_clean_rx_irq+0x0/0x42a [e1000]
[<f8997f88>] e1000_clean+0x304/0x4ae [e1000]
[<c014f745>] get_unused_fd_flags+0x42/0xaa
[<c0161327>] mntput_no_expire+0x11/0x47
[<c01580c1>] __link_path_walk+0x83f/0x9de
[<c01394cc>] do_generic_mapping_read+0x3cd/0x3d5
[<c02471ff>] net_rx_action+0x52/0xe9
[<c011b83c>] __do_softirq+0x35/0x75
[<c011b89e>] do_softirq+0x22/0x26
[<c0105e8c>] do_IRQ+0x58/0x6c
[<c0138e42>] find_lock_page+0x12/0x62
[<c0104573>] common_interrupt+0x23/0x28
[<c0141996>] __do_fault+0x136/0x2d1
[<c0142eac>] handle_mm_fault+0x2c5/0x571
[<c0145665>] vma_merge+0x120/0x178
[<c0113560>] do_page_fault+0x212/0x58a
[<c011334e>] do_page_fault+0x0/0x58a
[<c0292f2a>] error_code+0x6a/0x70
[<c0290000>] tpacket_rcv+0x142/0x38e
=======================
Code: 03 05 00 eb 38 c0 c3 56 89 ce 53 89 c3 89 e0 25 00 e0 ff ff ff 40 14 8b
0d f0 17 38 c0 8d 04 95 00 00 00 00 29 c1 83 39 00 74 04 <0f> 0b eb fe 8b 03
EIP: [<c011401f>] kmap_atomic_prot+0x24/0x83 SS:ESP 0068:f75f1b9c
Kernel panic - not syncing: Fatal exception in interrupt
------------------------------------

Disabling highmem support seems to work as a temporary solution.
Or does it only hide the symptoms of something really going wrong?

Any help is greatly appreciated.

Thomas

2007-08-22 17:55:42

by Sami Farin

[permalink] [raw]
Subject: Re: Kernel oops during netfilter memory allocation

On Wed, Aug 22, 2007 at 17:09:04 +0200, Thomas Jarosch wrote:
> Hello,
>
> > > kernel BUG at arch/i386/mm/highmem.c:38
> > >
> > > Try this.
>
> I just tried kernel 2.6.22.4 and 2.6.23-rc3. Using 2.6.23-rc3 vanilla,
> the box survives only 3 seconds with ipt_CRASH. Here's the backtrace,
> it's only slightly different:
>
> ------------[ cut here ]------------
> kernel BUG at arch/i386/mm/highmem.c:38!
> invalid opcode: 0000 [#1]
> Modules linked in: ipt_CRASH iptable_filter ip_tables x_tables deflate
> zlib_deflate twofish twofish_common serpent blowfish des cbc ecb blkcipher
> aes xcbcd

What are you using crypto stuff for?
Hopefully you are not calling crypto functions from
IRQ context.

--
Do what you love because life is too short for anything else.

2007-08-23 12:26:05

by Thomas Jarosch

[permalink] [raw]
Subject: Re: Kernel oops during netfilter memory allocation

Hello together,

I've found an easier way to demonstrate the bug
and filed a report at bugzilla.kernel.org:

http://bugzilla.kernel.org/show_bug.cgi?id=8928

Hope there will be a quick resolution for this issue
as the boxes crashing serve 800 users and more.

Best regards,
Thomas