2010-01-28 06:25:24

by Jon Masters

Subject: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

Folks,

A number of people seem to have reported this crash in various forms,
but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
evening so I know it's still present in the latest upstream kernels too.
Userspace is Fedora 12, and this happens both on all recent F12 kernels
(sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
on upstream 2.6.32 and 2.6.33-rc5 - it's hard to find a "known good".

The problem happens when using netfilter with KVM (problem does not
occur without the firewall loaded, for example) and will occur within a
few minutes of attempting to start or stop a guest that is connecting to
the network - the easiest way to reproduce so far is simply to start up
a bunch of Fedora guests and have them do a "yum update" cycle.

All of the crashes appear similar to the following (2.6.33-rc5):

general protection fault: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 6
Pid: 2982, comm: qemu-kvm Not tainted 2.6.33-rc5 #2 0F9382/Precision
WorkStation 490
RIP: 0010:[<ffffffff813b4115>] [<ffffffff813b4115>] destroy_conntrack
+0x82/0x114
RSP: 0018:ffff880028383c48 EFLAGS: 00010202
RAX: 0000000080000001 RBX: ffffffff81af33a0 RCX: 0000000000007530
RDX: dead000000200200 RSI: 0000000000000011 RDI: ffffffff81af33a0
RBP: ffff880028383c58 R08: ffff8802171b14d0 R09: 000000000000000a
R10: 00000040283957c0 R11: ffff8800283838a8 R12: ffffffff81ddbce0
R13: ffffffffa0281389 R14: 0000000000000000 R15: ffff88021140f430
FS: 00007fc17b7d2780(0000) GS:ffff880028380000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fc12c038000 CR3: 00000001db1bb000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-kvm (pid: 2982, threadinfo ffff8801dab40000, task
ffff8801dab38000)
Stack:
ffff88021140f400 ffff88021360e410 ffff880028383c68 ffffffff813b2016
<0> ffff880028383c88 ffffffff8138dbc3 ffff880028383c88 ffff88021140f400
<0> ffff880028383ca8 ffffffff8138d925 0000000300000000 ffff88021140f400
Call Trace:
<IRQ>
[<ffffffff813b2016>] nf_conntrack_destroy+0x1b/0x1d
[<ffffffff8138dbc3>] skb_release_head_state+0x77/0xb9
[<ffffffff8138d925>] __kfree_skb+0x16/0x82
[<ffffffff8138da2a>] kfree_skb+0x6a/0x73
[<ffffffffa0281389>] ip6_mc_input+0x214/0x221 [ipv6]
[<ffffffffa02813bd>] ip6_rcv_finish+0x27/0x2b [ipv6]
[<ffffffffa02816c7>] ipv6_rcv+0x306/0x33f [ipv6]
[<ffffffff813b2193>] ? nf_hook_slow+0x6a/0xcb
[<ffffffff81395593>] ? netif_receive_skb+0x0/0x3c6
[<ffffffff81395934>] netif_receive_skb+0x3a1/0x3c6
[<ffffffffa02ebae6>] br_handle_frame_finish+0x104/0x13c [bridge]
[<ffffffffa02ebcaf>] br_handle_frame+0x191/0x1aa [bridge]
[<ffffffff813958a0>] netif_receive_skb+0x30d/0x3c6
[<ffffffff813959e3>] process_backlog+0x8a/0xc3
[<ffffffff81395fd8>] net_rx_action+0x78/0x17e
[<ffffffff81052fda>] __do_softirq+0xe5/0x1a6
[<ffffffff8100ab1c>] call_softirq+0x1c/0x30
<EOI>
[<ffffffff8100c2b6>] ? do_softirq+0x46/0x83
[<ffffffff81396104>] netif_rx_ni+0x26/0x2b
[<ffffffffa0436d6e>] tun_chr_aio_write+0x3ce/0x429 [tun]
[<ffffffffa04369a0>] ? tun_chr_aio_write+0x0/0x429 [tun]
[<ffffffff81104b89>] do_sync_readv_writev+0xc1/0x100
[<ffffffff811d0c2f>] ? selinux_file_permission+0xa7/0xb3
[<ffffffff811048ed>] ? copy_from_user+0x2f/0x31
[<ffffffff811c7149>] ? security_file_permission+0x16/0x18
[<ffffffff811058d3>] do_readv_writev+0xa7/0x127
[<ffffffff81066761>] ? unlock_timer+0x12/0x14
[<ffffffff81066d18>] ? sys_timer_settime+0x258/0x2aa
[<ffffffff81105996>] vfs_writev+0x43/0x4e
[<ffffffff81105a86>] sys_writev+0x4a/0x93
[<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Code: c7 00 cd dd 81 e8 67 f6 ff ff 48 89 df e8 90 28 00 00 f6 43 78 08
75 2a 48 8b 53 10 48 85 d2 75 04 0f 0b eb fe 48 8b 43 08 a8 01 <48> 89
02 75 04 48 89 50 08 48 b8 00 02 20 00 00 00 ad de 48 89
RIP [<ffffffff813b4115>] destroy_conntrack+0x82/0x114
RSP <ffff880028383c48>
---[ end trace ee1619cd5f767f78 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 2982, comm: qemu-kvm Tainted: G D 2.6.33-rc5 #2
Call Trace:
<IRQ> [<ffffffff81421fb5>] panic+0x7a/0x13d
[<ffffffff81425569>] oops_end+0xb7/0xc7
[<ffffffff8100d35d>] die+0x5a/0x63

Several people have suggested various sysctls. I note that my F12 box
has the following set by default now:

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

This does not fix the problem, although I am indeed using bridged
networking for the guest instances.

At this point, I've disabled loading the firewall modules on this box
since it's behind a firewall anyway and I need it to keep running more
than ten minutes at a time :) but I am obviously interested in helping
to track this down and fix it. I don't know the code in question and I
won't have time to poke much further until the weekend.

Jon.


2010-01-28 05:59:34

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 00:46 -0500, Jon Masters wrote:

> A number of people seem to have reported this crash in various forms,
> but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
> evening so I know it's still present in the latest upstream kernels too.
> Userspace is Fedora 12, and this happens on both all recent F12 kernels
> (sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
> upstream 2.6.32, and 2.6.33-rc5 also - hard to find a "known good".

Now that I can capture the panic()s, I'm rebuilding the 2.6.33-rc5
kernel with some better debugging options to at least get some more data.

Jon.

2010-01-28 07:20:24

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 00:46 -0500, Jon Masters wrote:

> A number of people seem to have reported this crash in various forms,
> but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
> evening so I know it's still present in the latest upstream kernels too.
> Userspace is Fedora 12, and this happens on both all recent F12 kernels
> (sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
> upstream 2.6.32, and 2.6.33-rc5 also - hard to find a "known good".
>
> The problem happens when using netfilter with KVM (problem does not
> occur without the firewall loaded, for example) and will occur within a
> few minutes of attempting to start or stop a guest that is connecting to
> the network - the easiest way to reproduce so far is simply to start up
> a bunch of Fedora guests and have them do a "yum update" cycle.
>
> All of the crashes appear similar to the following (2.6.33-rc5):

Rebuilt the kernel with all debug options turned on, got some lockdep
warnings (haven't looked further yet). Here's the output (attached full
boot log also):

[ 205.843071]
[ 205.843073] =======================================================
[ 205.843769] [ INFO: possible circular locking dependency detected ]
[ 205.843769] 2.6.33-rc5 #3
[ 205.843769] -------------------------------------------------------
[ 205.843769] libvirtd/3058 is trying to acquire lock:
[ 205.843769] (s_active){++++.+}, at: [<ffffffff81171a56>]
sysfs_addrm_finish+0x36/0x55
[ 205.843769]
[ 205.843769] but task is already holding lock:
[ 205.843769] (rtnl_mutex){+.+.+.}, at: [<ffffffff813cedd9>] rtnl_lock
+0x17/0x19
[ 205.843769]
[ 205.843769] which lock already depends on the new lock.
[ 205.843769]
[ 205.843769]
[ 205.843769] the existing dependency chain (in reverse order) is:
[ 205.843769]
[ 205.843769] -> #4 (rtnl_mutex){+.+.+.}:
[ 205.843769] [<ffffffff8107ea94>] __lock_acquire+0xb84/0xd3c
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff8145504b>] __mutex_lock_common
+0x4b/0x392
[ 205.843769] [<ffffffff81455456>] mutex_lock_nested+0x3e/0x43
[ 205.843769] [<ffffffff813cedd9>] rtnl_lock+0x17/0x19
[ 205.843769] [<ffffffff813d05c0>] linkwatch_event+0xe/0x2c
[ 205.843769] [<ffffffff8106713f>] worker_thread+0x266/0x35f
[ 205.843769] [<ffffffff8106b1d0>] kthread+0x9a/0xa2
[ 205.843769] [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
[ 205.843769]
[ 205.843769] -> #3 ((linkwatch_work).work){+.+.+.}:
[ 205.843769] [<ffffffff8107ea94>] __lock_acquire+0xb84/0xd3c
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff81067136>] worker_thread+0x25d/0x35f
[ 205.843769] [<ffffffff8106b1d0>] kthread+0x9a/0xa2
[ 205.843769] [<ffffffff8100aaa4>] kernel_thread_helper+0x4/0x10
[ 205.843769]
[ 205.843769] -> #2 (events){+.+.+.}:
[ 205.843769] [<ffffffff8107ea94>] __lock_acquire+0xb84/0xd3c
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff81067a8b>] flush_workqueue+0x63/0xb1
[ 205.843769] [<ffffffff81067aee>] flush_scheduled_work
+0x15/0x17
[ 205.843769] [<ffffffff8138168b>] do_md_stop+0x2c0/0x508
[ 205.843769] [<ffffffff81383b94>] md_ioctl+0x92f/0xf49
[ 205.843769] [<ffffffff8121442d>] __blkdev_driver_ioctl
+0x39/0xa3
[ 205.843769] [<ffffffff81214dc8>] blkdev_ioctl+0x67d/0x6b1
[ 205.843769] [<ffffffff8113ef34>] block_ioctl+0x37/0x3b
[ 205.843769] [<ffffffff81126294>] vfs_ioctl+0x32/0xa6
[ 205.843769] [<ffffffff81126814>] do_vfs_ioctl+0x490/0x4d6
[ 205.843769] [<ffffffff811268b0>] sys_ioctl+0x56/0x79
[ 205.843769] [<ffffffff81009c32>] system_call_fastpath
+0x16/0x1b
[ 205.843769]
[ 205.843769] -> #1 (&new->reconfig_mutex){+.+.+.}:
[ 205.843769] [<ffffffff8107ea94>] __lock_acquire+0xb84/0xd3c
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff8145504b>] __mutex_lock_common
+0x4b/0x392
[ 205.843769] [<ffffffff814553d0>]
mutex_lock_interruptible_nested+0x3e/0x43
[ 205.843769] [<ffffffff8137bd62>] mddev_lock+0x17/0x19
[ 205.843769] [<ffffffff8137c039>] md_attr_show+0x32/0x5d
[ 205.843769] [<ffffffff811708dc>] sysfs_read_file+0xbd/0x17f
[ 205.843769] [<ffffffff8111983d>] vfs_read+0xab/0x108
[ 205.843769] [<ffffffff8111995a>] sys_read+0x4a/0x6e
[ 205.843769] [<ffffffff81009c32>] system_call_fastpath
+0x16/0x1b
[ 205.843769]
[ 205.843769] -> #0 (s_active){++++.+}:
[ 205.843769] [<ffffffff8107e93e>] __lock_acquire+0xa2e/0xd3c
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff81171463>] sysfs_deactivate+0x9a/0x103
[ 205.843769] [<ffffffff81171a56>] sysfs_addrm_finish+0x36/0x55
[ 205.843769] [<ffffffff81171aab>] remove_dir+0x36/0x3e
[ 205.843769] [<ffffffff81171b60>] sysfs_remove_dir+0x9d/0xbe
[ 205.843769] [<ffffffff812206c1>] kobject_del+0x16/0x37
[ 205.843769] [<ffffffff8122078a>] kobject_release+0xa8/0xf8
[ 205.843769] [<ffffffff8122165d>] kref_put+0x43/0x4d
[ 205.843769] [<ffffffff8122063c>] kobject_put+0x47/0x4b
[ 205.843769] [<ffffffffa02d5ef5>] br_sysfs_delbr+0x21/0x43
[bridge]
[ 205.843769] [<ffffffffa02d20e9>] del_br+0x5a/0x6f [bridge]
[ 205.843769] [<ffffffffa02d24b2>] br_del_bridge+0x59/0x67
[bridge]
[ 205.843769] [<ffffffffa02d3764>] br_ioctl_deviceless_stub
+0x1fb/0x21b [bridge]
[ 205.843769] [<ffffffff813b32c4>] sock_ioctl+0x141/0x20d
[ 205.843769] [<ffffffff81126294>] vfs_ioctl+0x32/0xa6
[ 205.843769] [<ffffffff81126814>] do_vfs_ioctl+0x490/0x4d6
[ 205.843769] [<ffffffff811268b0>] sys_ioctl+0x56/0x79
[ 205.843769] [<ffffffff81009c32>] system_call_fastpath
+0x16/0x1b
[ 205.843769]
[ 205.843769] other info that might help us debug this:
[ 205.843769]
[ 205.843769] 2 locks held by libvirtd/3058:
[ 205.843769] #0: (br_ioctl_mutex){+.+.+.}, at: [<ffffffff813b32a9>]
sock_ioctl+0x126/0x20d
[ 205.843769] #1: (rtnl_mutex){+.+.+.}, at: [<ffffffff813cedd9>]
rtnl_lock+0x17/0x19
[ 205.843769]
[ 205.843769] stack backtrace:
[ 205.843769] Pid: 3058, comm: libvirtd Not tainted 2.6.33-rc5 #3
[ 205.843769] Call Trace:
[ 205.843769] [<ffffffff8107dae0>] print_circular_bug+0xa8/0xb6
[ 205.843769] [<ffffffff8107e93e>] __lock_acquire+0xa2e/0xd3c
[ 205.843769] [<ffffffff81171a56>] ? sysfs_addrm_finish+0x36/0x55
[ 205.843769] [<ffffffff8107ed19>] lock_acquire+0xcd/0xf1
[ 205.843769] [<ffffffff81171a56>] ? sysfs_addrm_finish+0x36/0x55
[ 205.843769] [<ffffffff81171463>] sysfs_deactivate+0x9a/0x103
[ 205.843769] [<ffffffff81171a56>] ? sysfs_addrm_finish+0x36/0x55
[ 205.843769] [<ffffffff8107c028>] ? trace_hardirqs_off+0xd/0xf
[ 206.367937] [<ffffffff81454e6a>] ? __mutex_unlock_slowpath
+0x120/0x132
[ 206.367937] [<ffffffff81171a56>] sysfs_addrm_finish+0x36/0x55
[ 206.367937] [<ffffffff81171aab>] remove_dir+0x36/0x3e
[ 206.367937] [<ffffffff81171b60>] sysfs_remove_dir+0x9d/0xbe
[ 206.367937] [<ffffffff812206c1>] kobject_del+0x16/0x37
[ 206.367937] [<ffffffff8122078a>] kobject_release+0xa8/0xf8
[ 206.367937] [<ffffffff812206e2>] ? kobject_release+0x0/0xf8
[ 206.367937] [<ffffffff8122165d>] kref_put+0x43/0x4d
[ 206.367937] [<ffffffff8122063c>] kobject_put+0x47/0x4b
[ 206.367937] [<ffffffff8105d218>] ? del_timer_sync+0x0/0xa6
[ 206.367937] [<ffffffffa02d5ef5>] br_sysfs_delbr+0x21/0x43 [bridge]
[ 206.367937] [<ffffffffa02d20e9>] del_br+0x5a/0x6f [bridge]
[ 206.367937] [<ffffffffa02d24b2>] br_del_bridge+0x59/0x67 [bridge]
[ 206.367937] [<ffffffffa02d3764>] br_ioctl_deviceless_stub
+0x1fb/0x21b [bridge]
[ 206.367937] [<ffffffff813b32c4>] sock_ioctl+0x141/0x20d
[ 206.367937] [<ffffffff81126294>] vfs_ioctl+0x32/0xa6
[ 206.367937] [<ffffffff81126814>] do_vfs_ioctl+0x490/0x4d6
[ 206.367937] [<ffffffff811268b0>] sys_ioctl+0x56/0x79
[ 206.367937] [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
[ 206.482734] kobject: 'brif' (ffff880213a243b8): calling ktype release
[ 206.492872] kobject: (ffff880213a243b8): dynamic_kobj_release
[ 206.498841] kobject: 'brif': free name
[ 206.527833] kobject: 'virbr0' (ffff8801cf3b0500): kobject_uevent_env
[ 206.534360] kobject: 'virbr0' (ffff8801cf3b0500): fill_kobj_path:
path = '/devices/virtual/net/virbr0'
[ 206.551468] kobject: 'virbr0' (ffff8801cf3b0500): kobject_cleanup
[ 206.557611] kobject: 'virbr0' (ffff8801cf3b0500): calling ktype
release
[ 206.564358] kobject: 'virbr0': free name
[ 208.575408] kobject: 'tun' (ffffffffa044b510): kobject_add_internal:
parent: 'module', set: 'module'
[ 208.595562] kobject: 'holders' (ffff8801d352cf68):
kobject_add_internal: parent: 'tun', set: '<NULL>'
[ 208.604990] kobject: 'tun' (ffffffffa044b510): kobject_uevent_env
[ 208.611296] kobject: 'tun' (ffffffffa044b510): fill_kobj_path: path =
'/module/tun'
[ 208.620119] kobject: 'notes' (ffff8801d352c660):
kobject_add_internal: parent: 'tun', set: '<NULL>'
[ 208.630343] tun: Universal TUN/TAP device driver, 1.6
[ 208.635421] tun: (C) 1999-2004 Max Krasnyansky <[email protected]>
[ 208.641721] kobject: 'tun' (ffff8801cd319e08): kobject_add_internal:
parent: 'misc', set: 'devices'
[ 208.651194] kobject: 'tun' (ffff8801cd319e08): kobject_uevent_env
[ 208.657386] kobject: 'tun' (ffff8801cd319e08): fill_kobj_path: path =
'/devices/virtual/misc/tun'
[ 208.671916] kobject: 'vnet0' (ffff880207df35e0):
kobject_add_internal: parent: 'net', set: 'devices'
[ 208.682667] kobject: 'vnet0' (ffff880207df35e0): kobject_uevent_env
[ 208.689435] kobject: 'vnet0' (ffff880207df35e0): fill_kobj_path: path
= '/devices/virtual/net/vnet0'
[ 208.712572] device vnet0 entered promiscuous mode
[ 208.717571] kobject: 'brport' (ffff8801cd378e88):
kobject_add_internal: parent: 'vnet0', set: '<NULL>'
[ 208.728149] kobject: 'brport' (ffff8801cd378e88): kobject_uevent_env
[ 208.734561] kobject: 'brport' (ffff8801cd378e88): kobject_uevent_env:
filter function caused the event to drop!
[ 208.748356] br0: port 2(vnet0) entering forwarding state
[ 209.576245] kobject: 'vnet1' (ffff8801cc9a0500):
kobject_add_internal: parent: 'net', set: 'devices'
[ 209.587167] kobject: 'vnet1' (ffff8801cc9a0500): kobject_uevent_env
[ 209.593773] kobject: 'vnet1' (ffff8801cc9a0500): fill_kobj_path: path
= '/devices/virtual/net/vnet1'
[ 209.614211] device vnet1 entered promiscuous mode
[ 209.619898] kobject: 'brport' (ffff8801cee35308):
kobject_add_internal: parent: 'vnet1', set: '<NULL>'
[ 209.631310] kobject: 'brport' (ffff8801cee35308): kobject_uevent_env
[ 209.638303] kobject: 'brport' (ffff8801cee35308): kobject_uevent_env:
filter function caused the event to drop!
[ 209.654971] br0: port 3(vnet1) entering forwarding state
[ 210.268995] SELinux: initialized (dev mqueue, type mqueue), uses
transition SIDs
[ 210.277526] kobject: 'pid_2' (ffff88020e222128):
kobject_add_internal: parent: 'slab', set: 'slab'
[ 210.288037] kobject: 'pid_2' (ffff88020e222128): kobject_uevent_env
[ 210.294431] kobject: 'pid_2' (ffff88020e222128): fill_kobj_path: path
= '/kernel/slab/pid_2'
[ 210.304019] SELinux: initialized (dev proc, type proc), uses
genfs_contexts
[ 210.397532] SELinux: initialized (dev mqueue, type mqueue), uses
transition SIDs
[ 210.434435] lo: Disabled Privacy Extensions
[ 210.443268] SELinux: initialized (dev proc, type proc), uses
genfs_contexts
[ 210.787652] kobject: 'lo' (ffff880207b725a0): kobject_cleanup
[ 210.793444] kobject: 'lo' (ffff880207b725a0): calling ktype release
[ 210.799783] kobject: 'lo': free name
[ 218.966215] vnet0: no IPv6 routers present
[ 220.336219] vnet1: no IPv6 routers present
[ 238.799662] kobject: 'vnet2' (ffff8801ccba8500):
kobject_add_internal: parent: 'net', set: 'devices'
[ 238.811833] kobject: 'vnet2' (ffff8801ccba8500): kobject_uevent_env
[ 238.818841] kobject: 'vnet2' (ffff8801ccba8500): fill_kobj_path: path
= '/devices/virtual/net/vnet2'
[ 238.841088] device vnet2 entered promiscuous mode
[ 238.846209] kobject: 'brport' (ffff8801cf2d92d0):
kobject_add_internal: parent: 'vnet2', set: '<NULL>'
[ 238.856592] kobject: 'brport' (ffff8801cf2d92d0): kobject_uevent_env
[ 238.863088] kobject: 'brport' (ffff8801cf2d92d0): kobject_uevent_env:
filter function caused the event to drop!
[ 238.878798] br0: port 4(vnet2) entering forwarding state
[ 249.395339] vnet2: no IPv6 routers present
[ 339.729225] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 339.730086] last sysfs
file: /sys/devices/virtual/block/md0/md/sync_speed
[ 339.730086] CPU 7
[ 339.730086] Pid: 3245, comm: qemu-kvm Not tainted 2.6.33-rc5 #3
0F9382/Precision WorkStation 490
[ 339.730086] RIP: 0010:[<ffffffff813e5f3e>] [<ffffffff813e5f3e>]
nf_ct_remove_expectations+0x49/0x5c
[ 339.730086] RSP: 0018:ffff880030603c18 EFLAGS: 00010206
[ 339.730086] RAX: ffff8801c2ea530b RBX: 009ab2000008d300 RCX:
ffffffffa02925c1
[ 339.730086] RDX: 000000000000006b RSI: ffff880030603be8 RDI:
009ab2000008d378
[ 339.730086] RBP: ffff880030603c28 R08: 0000000000000002 R09:
0000000000000000
[ 339.730086] R10: 0000000000000003 R11: ffff88019d28a268 R12:
ffffffff829a0400
[ 339.730086] R13: ffffffffa02925c1 R14: 0000000000000000 R15:
ffff8801a18e4630
[ 339.730086] FS: 00007fb6197ca780(0000) GS:ffff880030600000(0000)
knlGS:0000000000000000
[ 339.730086] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 339.730086] CR2: 00007f747a184178 CR3: 00000001badd8000 CR4:
00000000000026e0
[ 339.730086] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 339.730086] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 339.730086] Process qemu-kvm (pid: 3245, threadinfo ffff8801b9370000,
task ffff8801d53c48a0)
[ 339.730086] Stack:
[ 339.730086] ffffffff81b045d0 ffffffff829a0400 ffff880030603c48
ffffffff813e3426
[ 339.730086] <0> ffff8801a18e4600 ffff88019d28a250 ffff880030603c68
ffffffff813e10f8
[ 339.730086] <0> ffff880030603c68 ffffffff81b045d0 ffff880030603c88
ffffffff813bc080
[ 339.730086] Call Trace:
[ 339.730086] <IRQ>
[ 339.730086] [<ffffffff813e3426>] destroy_conntrack+0x73/0x121
[ 339.730086] [<ffffffff813e10f8>] nf_conntrack_destroy+0x2c/0x33
[ 339.730086] [<ffffffff813bc080>] skb_release_head_state+0x77/0xb9
[ 339.730086] [<ffffffff813bbde2>] __kfree_skb+0x16/0x82
[ 339.730086] [<ffffffff813bbee7>] kfree_skb+0x6a/0x73
[ 339.730086] [<ffffffffa02925c1>] ip6_mc_input+0x214/0x221 [ipv6]
[ 339.730086] [<ffffffffa02925f5>] ip6_rcv_finish+0x27/0x2b [ipv6]
[ 339.730086] [<ffffffffa0292938>] ipv6_rcv+0x33f/0x37f [ipv6]
[ 339.730086] [<ffffffff813c416a>] netif_receive_skb+0x3b0/0x3da
[ 339.730086] [<ffffffffa02d2b6e>] br_handle_frame_finish+0x104/0x13c
[bridge]
[ 339.730086] [<ffffffffa02d2d37>] br_handle_frame+0x191/0x1aa
[bridge]
[ 339.730086] [<ffffffff813c40d6>] netif_receive_skb+0x31c/0x3da
[ 339.730086] [<ffffffff8106e6ee>] ? __run_hrtimer+0xec/0x11b
[ 339.730086] [<ffffffff813c4230>] process_backlog+0x9c/0xd8
[ 339.730086] [<ffffffff813c489b>] net_rx_action+0x7d/0x19c
[ 339.730086] [<ffffffff81055d08>] __do_softirq+0xf8/0x1cd
[ 339.730086] [<ffffffff8100ab9c>] call_softirq+0x1c/0x30
[ 339.730086] <EOI>
[ 339.730086] [<ffffffff8100c38f>] ? do_softirq+0x4b/0xa3
[ 339.730086] [<ffffffff813c49e0>] netif_rx_ni+0x26/0x2b
[ 339.730086] [<ffffffffa0449d6d>] tun_chr_aio_write+0x3ce/0x429 [tun]
[ 339.730086] [<ffffffffa044999f>] ? tun_chr_aio_write+0x0/0x429 [tun]
[ 339.730086] [<ffffffff81118fd2>] do_sync_readv_writev+0xc1/0x100
[ 339.730086] [<ffffffff811e9ee1>] ? selinux_file_permission+0xa7/0xb3
[ 339.730086] [<ffffffff81118d36>] ? copy_from_user+0x2f/0x31
[ 339.730086] [<ffffffff811e02f5>] ? security_file_permission
+0x16/0x18
[ 339.730086] [<ffffffff81119d1f>] do_readv_writev+0xa7/0x127
[ 339.730086] [<ffffffff8111a1d7>] ? rcu_read_unlock+0x21/0x23
[ 339.730086] [<ffffffff8111a2bd>] ? fget_light+0xe4/0xf2
[ 339.730086] [<ffffffff8111a230>] ? fget_light+0x57/0xf2
[ 339.730086] [<ffffffff81119de2>] vfs_writev+0x43/0x4e
[ 339.730086] [<ffffffff81119ed2>] sys_writev+0x4a/0x93
[ 339.730086] [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
[ 339.730086] Code: 38 0f b6 d2 48 01 d0 74 30 48 8b 58 28 eb 13 48 89
df e8 fc f8 ff ff 48 89 df e8 c8 f8 ff ff 4c 89 e3 48 85 db 74 12 48 8d
7b 78 <4c> 8b 23 e8 e2 73 c7 ff 85 c0 74 e8 eb d6 5b 41 5c c9 c3 55 48
[ 339.730086] RIP [<ffffffff813e5f3e>] nf_ct_remove_expectations
+0x49/0x5c
[ 339.730086] RSP <ffff880030603c18>
[ 340.095082] ---[ end trace 7be6d51534d25d87 ]---
[ 340.099728] Kernel panic - not syncing: Fatal exception in interrupt
[ 340.106123] Pid: 3245, comm: qemu-kvm Tainted: G D 2.6.33-rc5
#3
[ 340.112853] Call Trace:
[ 340.112860] <IRQ> [<ffffffff81453b1d>] panic+0x7a/0x142
[ 340.112886] [<ffffffff81458011>] oops_end+0xb7/0xc7
[ 340.112898] [<ffffffff8100d421>] die+0x5a/0x63
[ 340.112909] [<ffffffff81457bcc>] do_general_protection+0x12a/0x133
[ 340.112922] [<ffffffff81457100>] ? irq_return+0x0/0x10
[ 340.112956] [<ffffffffa02925c1>] ? ip6_mc_input+0x214/0x221 [ipv6]
[ 340.112969] [<ffffffff81457345>] general_protection+0x25/0x30
[ 340.113002] [<ffffffffa02925c1>] ? ip6_mc_input+0x214/0x221 [ipv6]

Jon.


Attachments:
kvm_crash_20100128.txt (78.33 kB)

2010-01-28 08:07:22

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 02:20 -0500, Jon Masters wrote:
> On Thu, 2010-01-28 at 00:46 -0500, Jon Masters wrote:
>
> > A number of people seem to have reported this crash in various forms,
> > but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
> > evening so I know it's still present in the latest upstream kernels too.
> > Userspace is Fedora 12, and this happens on both all recent F12 kernels
> > (sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
> > upstream 2.6.32, and 2.6.33-rc5 also - hard to find a "known good".
> >
> > The problem happens when using netfilter with KVM (problem does not
> > occur without the firewall loaded, for example) and will occur within a
> > few minutes of attempting to start or stop a guest that is connecting to
> > the network - the easiest way to reproduce so far is simply to start up
> > a bunch of Fedora guests and have them do a "yum update" cycle.
> >
> > All of the crashes appear similar to the following (2.6.33-rc5):
>
> Rebuilt the kernel with all debug options turned on, got some lockdep
> warnings (haven't looked further yet). Here's the output (attached full
> boot log also):

> [ 339.730086] RIP: 0010:[<ffffffff813e5f3e>] [<ffffffff813e5f3e>]
> nf_ct_remove_expectations+0x49/0x5c

This appears to be in the hlist_for_each_entry_safe iteration within
nf_ct_remove_expectations, iterating over the list of nf_conn_help(ers)
returned by nfct_help. I don't know what that code does (I have an idea
but only at a high level at this stage), though I'm poking a little here
to see if I can understand enough of netfilter to be useful. Feel free
to give me some pointers to help you guys debug this faster.

Jon.

2010-01-28 12:19:36

by Patrick McHardy

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

Jon Masters wrote:
> On Thu, 2010-01-28 at 00:46 -0500, Jon Masters wrote:
>
>> A number of people seem to have reported this crash in various forms,
>> but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
>> evening so I know it's still present in the latest upstream kernels too.
>> Userspace is Fedora 12, and this happens on both all recent F12 kernels
>> (sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
>> upstream 2.6.32, and 2.6.33-rc5 also - hard to find a "known good".
>
> Now I can capture the panic()s, I'm rebuilding the 2.6.33-rc5 kernel
> with some better debugging options to at least get some more data.

Could you try "ip6tables -t raw -I PREROUTING -j TRACE" after loading
the ip6t_LOG module? That way we can hopefully see the entire packet
paths through netfilter.

2010-01-28 21:06:10

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 13:19 +0100, Patrick McHardy wrote:
> Jon Masters wrote:
> > On Thu, 2010-01-28 at 00:46 -0500, Jon Masters wrote:
> >
> >> A number of people seem to have reported this crash in various forms,
> >> but I have yet to see a solution, and can reproduce on 2.6.33-rc5 this
> >> evening so I know it's still present in the latest upstream kernels too.
> >> Userspace is Fedora 12, and this happens on both all recent F12 kernels
> >> (sporadic in 2.6.31 until recently, solidly reproducible on 2.6.32) and
> >> upstream 2.6.32, and 2.6.33-rc5 also - hard to find a "known good".
> >
> > Now I can capture the panic()s, I'm rebuilding the 2.6.33-rc5 kernel
> > with some better debugging options to at least get some more data.
>
> Could you try "ip6tables -t raw -I PREROUTING -j TRACE" after loading
> the ip6t_LOG module? That way we can hopefully see the entire packet
> paths through netfilter.

Doing that now. Turning on a bunch of debug options like netfilter debug
and kmemleak causes it to trigger a lot less often, so I've reverted to
the "known bad" config that triggers this reliably.

Turning on netfilter debug and kmemleak resulted in a large number of:

[18493.405393] nf_conntrack: table full, dropping packet.
[18503.165713] kmemleak: 2 new suspected memory leaks
(see /sys/kernel/debug/kmemleak)

etc. The memleak warnings don't look all that useful (attached what was
left in the buffer anyway) but the slowdown (KSM and kmemleak both at
near 100% CPU) seems to affect the timing of this and make it less
likely to trigger. The box fell over as I was rebooting it after a few
hours, and just after I detached the serial console to reset it :(

Attached current config. Send you the netfilter trace shortly.

Jon.


Attachments:
config-known-to-trigger-minimal.txt (76.05 kB)
kmemleak_20100128_1 (62.71 kB)

2010-01-28 21:49:24

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 13:19 +0100, Patrick McHardy wrote:
> ip6tables -t raw -I PREROUTING -j TRACE

I'm not seeing anything on the console - is there additional conntrack
debugging I can enable? I will try building in netfilter debug again.

But I did just get this crash while testing again:

BUG: unable to handle kernel paging request at ffff8802aae52f20
IP: [<ffffffff813b1e7d>] __nf_conntrack_find+0x53/0xb1
PGD 1a3c063 PUD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 5
Pid: 2012, comm: qemu-kvm Not tainted 2.6.33-rc5 #1 0F9382/Precision
WorkStation 490
RIP: 0010:[<ffffffff813b1e7d>] [<ffffffff813b1e7d>] __nf_conntrack_find
+0x53/0xb1
RSP: 0018:ffff880172893778 EFLAGS: 00010282
RAX: ffff8802aae52f20 RBX: ffff880172893878 RCX: 00000000eb883b77
RDX: 000000000001d710 RSI: 0000000081af2f90 RDI: 000000003b9ab4e4
RBP: ffff8801728937b8 R08: 000000008a6bf028 R09: 00000000a2d26753
R10: 0000000067d81e2b R11: ffff88017289389c R12: ffffffff81ddace0
R13: ffff880172893878 R14: ffffffff81ddace0 R15: 00000000137f25e4
FS: 00007f002fe16780(0000) GS:ffff880028340000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8802aae52f20 CR3: 00000001717dd000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-kvm (pid: 2012, threadinfo ffff880172892000, task
ffff8801702bc5f0)

Jon.

2010-01-28 23:21:46

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Thu, 2010-01-28 at 13:19 +0100, Patrick McHardy wrote:
> ip6tables -t raw -I PREROUTING -j TRACE

Ok. Here are three crashes for you in a row, with the correct trace
options set and netfilter debug turned on this time. The latest config
used to reproduce this 100% reliably on 2.6.33-rc5 is attached.

The host system is configured as follows:

[jcm@perihelion ~]$ /sbin/ifconfig
br0 Link encap:Ethernet HWaddr 00:13:72:A0:60:F3
inet addr:192.168.1.3 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::213:72ff:fea0:60f3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:313 errors:0 dropped:0 overruns:0 frame:0
TX packets:226 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:74337 (72.5 KiB) TX bytes:43151 (42.1 KiB)

eth0 Link encap:Ethernet HWaddr 00:13:72:A0:60:F3
inet6 addr: fe80::213:72ff:fea0:60f3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:299 errors:0 dropped:0 overruns:0 frame:0
TX packets:229 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:78655 (76.8 KiB) TX bytes:44277 (43.2 KiB)
Interrupt:16

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:151 errors:0 dropped:0 overruns:0 frame:0
TX packets:151 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:16409 (16.0 KiB) TX bytes:16409 (16.0 KiB)

The KVM guests are configured within the 192.168.1 subnet:

kernel.bos.jonmasters.org. 192.168.1.10
fedora.bos.jonmasters.org. 192.168.1.210
rawhide.bos.jonmasters.org. 192.168.1.211

Exact sequence tested (recording it here for future reference):

1). Boot host system. Ensure that standard Fedora F12 out-of-the-box
firewall rules are activated, and iptables/ip6tables start up. This is
after Fedora added the three sysctls to turn off bridge filtering
to /etc/sysctl.conf, which was thought to have "worked around" this.
2). Setup trace options for logging netfilter traversal and "echo 7
>/proc/sys/kernel/printk" to set appropriate kernel loglevel.
3). Start libvirtd (autostarts "Fedora-Rawhide-x86_64" and
"kernel-x86_64", which are test VM instances).
4). Log in to "kernel" (F12) and do a "yum clean" and "yum update
--skip-broken" to generate network traffic. This is the host that I do
autobuilds of Linus' tree on for my twitter feed.
5). Wait a while for first two VMs to start up sshd.
6). Start a third "Fedora-x86_64" VM. I guess it doesn't matter which
one, but I always use this one in my tests to avoid hurting the images.
7). Panic every time, shortly after "port 4(vnet2) entering forwarding
state" in the log messages. It panic()s within about 1 minute.

Let me know what else I can do to help you track this down. I am not a
netfilter developer, but I can follow instructions and learn :) I wish I
had enough time in the day to go learn that code though.

Jon.


Attachments:
config-nflogtracedebug (76.07 kB)
kvm_crash_trace_20100128_1800.txt (49.41 kB)
kvm_crash_trace_20100128_1805.txt (53.43 kB)
kvm_crash_trace_20100128_1810.txt (53.55 kB)

2010-01-29 01:11:43

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

Hi,

I turned on a few more debug options (with more debugging enabled the bug
becomes less likely to trigger, I guess through timing/sheer chance). With
lockdep, kobject, and spinlock debugging on, all I get in addition is a
lockdep warning from libvirtd:

virbr0: starting userspace STP failed, starting kernel STP

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.33-rc5 #1
-------------------------------------------------------
libvirtd/2908 is trying to acquire lock:
(s_active){++++.+}, at: [<ffffffff81168c6e>] sysfs_addrm_finish
+0x36/0x55

but task is already holding lock:
(rtnl_mutex){+.+.+.}, at: [<ffffffff813b618d>] rtnl_lock+0x17/0x19

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (rtnl_mutex){+.+.+.}:
[<ffffffff8107b8a2>] __lock_acquire+0xb71/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff8143bdae>] __mutex_lock_common+0x4b/0x392
[<ffffffff8143c1b9>] mutex_lock_nested+0x3e/0x43
[<ffffffff813b618d>] rtnl_lock+0x17/0x19
[<ffffffff813b7974>] linkwatch_event+0xe/0x2c
[<ffffffff810650c8>] worker_thread+0x266/0x35f
[<ffffffff8106913c>] kthread+0x9a/0xa2
[<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10

-> #3 ((linkwatch_work).work){+.+.+.}:
[<ffffffff8107b8a2>] __lock_acquire+0xb71/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff810650bf>] worker_thread+0x25d/0x35f
[<ffffffff8106913c>] kthread+0x9a/0xa2
[<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10

-> #2 (events){+.+.+.}:
[<ffffffff8107b8a2>] __lock_acquire+0xb71/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff81065a01>] flush_workqueue+0x63/0xb1
[<ffffffff81065a64>] flush_scheduled_work+0x15/0x17
[<ffffffff813685ff>] do_md_stop+0x2c0/0x508
[<ffffffff8136ab07>] md_ioctl+0x92f/0xf49
[<ffffffff8120afcd>] __blkdev_driver_ioctl+0x39/0xa3
[<ffffffff8120b968>] blkdev_ioctl+0x67d/0x6b1
[<ffffffff811364e8>] block_ioctl+0x37/0x3b
[<ffffffff8111d8a4>] vfs_ioctl+0x32/0xa6
[<ffffffff8111de24>] do_vfs_ioctl+0x490/0x4d6
[<ffffffff8111dec0>] sys_ioctl+0x56/0x79
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b

-> #1 (&new->reconfig_mutex){+.+.+.}:
[<ffffffff8107b8a2>] __lock_acquire+0xb71/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff8143bdae>] __mutex_lock_common+0x4b/0x392
[<ffffffff8143c133>] mutex_lock_interruptible_nested+0x3e/0x43
[<ffffffff81362d1e>] mddev_lock+0x17/0x19
[<ffffffff81362ff5>] md_attr_show+0x32/0x5d
[<ffffffff81167af4>] sysfs_read_file+0xbd/0x17f
[<ffffffff81110ff5>] vfs_read+0xab/0x108
[<ffffffff81111112>] sys_read+0x4a/0x6e
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b

-> #0 (s_active){++++.+}:
[<ffffffff8107b74c>] __lock_acquire+0xa1b/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff8116867b>] sysfs_deactivate+0x9a/0x103
[<ffffffff81168c6e>] sysfs_addrm_finish+0x36/0x55
[<ffffffff81168cc3>] remove_dir+0x36/0x3e
[<ffffffff81168d78>] sysfs_remove_dir+0x9d/0xbe
[<ffffffff8121714b>] kobject_del+0x16/0x37
[<ffffffff81217233>] kobject_release+0xc7/0x1d9
[<ffffffff81218319>] kref_put+0x43/0x4d
[<ffffffff812170c6>] kobject_put+0x47/0x4b
[<ffffffffa02f6ec5>] br_sysfs_delbr+0x21/0x43 [bridge]
[<ffffffffa02f30d5>] del_br+0x5a/0x6f [bridge]
[<ffffffffa02f349e>] br_del_bridge+0x59/0x67 [bridge]
[<ffffffffa02f4740>] br_ioctl_deviceless_stub+0x1fb/0x21b
[bridge]
[<ffffffff8139a7d5>] sock_ioctl+0x141/0x20d
[<ffffffff8111d8a4>] vfs_ioctl+0x32/0xa6
[<ffffffff8111de24>] do_vfs_ioctl+0x490/0x4d6
[<ffffffff8111dec0>] sys_ioctl+0x56/0x79
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

2 locks held by libvirtd/2908:
#0: (br_ioctl_mutex){+.+.+.}, at: [<ffffffff8139a7ba>] sock_ioctl
+0x126/0x20d
#1: (rtnl_mutex){+.+.+.}, at: [<ffffffff813b618d>] rtnl_lock+0x17/0x19

stack backtrace:
Pid: 2908, comm: libvirtd Not tainted 2.6.33-rc5 #1
Call Trace:
[<ffffffff8107a90f>] print_circular_bug+0xa8/0xb6
[<ffffffff8107b74c>] __lock_acquire+0xa1b/0xd19
[<ffffffff8107bb26>] lock_acquire+0xdc/0x102
[<ffffffff81168c6e>] ? sysfs_addrm_finish+0x36/0x55
[<ffffffff810795fe>] ? lockdep_init_map+0x9e/0x113
[<ffffffff8116867b>] sysfs_deactivate+0x9a/0x103
[<ffffffff81168c6e>] ? sysfs_addrm_finish+0x36/0x55
[<ffffffff81078d72>] ? trace_hardirqs_off+0xd/0xf
[<ffffffff8143bbcd>] ? __mutex_unlock_slowpath+0x120/0x132
[<ffffffff81168c6e>] sysfs_addrm_finish+0x36/0x55
[<ffffffff81168cc3>] remove_dir+0x36/0x3e
[<ffffffff81168d78>] sysfs_remove_dir+0x9d/0xbe
[<ffffffff8121714b>] kobject_del+0x16/0x37
[<ffffffff81217233>] kobject_release+0xc7/0x1d9
[<ffffffff8121716c>] ? kobject_release+0x0/0x1d9
[<ffffffff81218319>] kref_put+0x43/0x4d
[<ffffffff812170c6>] kobject_put+0x47/0x4b
[<ffffffff8105b24c>] ? del_timer_sync+0x0/0xa6
[<ffffffffa02f6ec5>] br_sysfs_delbr+0x21/0x43 [bridge]
[<ffffffffa02f30d5>] del_br+0x5a/0x6f [bridge]
[<ffffffffa02f349e>] br_del_bridge+0x59/0x67 [bridge]
[<ffffffffa02f4740>] br_ioctl_deviceless_stub+0x1fb/0x21b [bridge]
[<ffffffff8139a7d5>] sock_ioctl+0x141/0x20d
[<ffffffff8111d8a4>] vfs_ioctl+0x32/0xa6
[<ffffffff8111de24>] do_vfs_ioctl+0x490/0x4d6
[<ffffffff8111dec0>] sys_ioctl+0x56/0x79
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <[email protected]>
device vnet0 entered promiscuous mode
br0: port 2(vnet0) entering forwarding state
device vnet1 entered promiscuous mode
br0: port 3(vnet1) entering forwarding state
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
lo: Disabled Privacy Extensions
SELinux: initialized (dev proc, type proc), uses genfs_contexts
vnet0: no IPv6 routers present
vnet1: no IPv6 routers present
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:75:4b:ad:86:dd
SRC=0000:0000:0000:0000:0000:0000:0000:0000
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:75:4b:ad:86:dd
SRC=0000:0000:0000:0000:0000:0000:0000:0000
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:ff:75:4b:ad:52:54:00:75:4b:ad:86:dd
SRC=0000:0000:0000:0000:0000:0000:0000:0000
DST=ff02:0000:0000:0000:0000:0001:ff75:4bad LEN=64 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=135 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:75:4b:ad:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe75:4bad
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:75:4b:ad:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe75:4bad
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:75:4b:ad:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe75:4bad
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:75:4b:ad:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe75:4bad
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:75:4b:ad:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe75:4bad
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:4c:c6:65:86:dd
SRC=0000:0000:0000:0000:0000:0000:0000:0000
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:ff:4c:c6:65:52:54:00:4c:c6:65:86:dd
SRC=0000:0000:0000:0000:0000:0000:0000:0000
DST=ff02:0000:0000:0000:0000:0001:ff4c:c665 LEN=64 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=135 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
TRACE: raw:PREROUTING:policy:2 IN=br0 OUT=
MAC=33:33:00:00:00:16:52:54:00:4c:c6:65:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe4c:c665
DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=76 TC=0 HOPLIMIT=1
FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0
device vnet2 entered promiscuous mode
br0: port 4(vnet2) entering forwarding state
vnet2: no IPv6 routers present
BUG: unable to handle kernel paging request at ffff8805f74c8598
IP: [<ffffffff813ca6c8>] __nf_conntrack_find+0x53/0xb1
PGD 1a3c063 PUD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/md0/md/sync_speed
CPU 2
Pid: 3226, comm: qemu-kvm Not tainted 2.6.33-rc5 #1 0F9382/Precision
WorkStation 490
RIP: 0010:[<ffffffff813ca6c8>] [<ffffffff813ca6c8>] __nf_conntrack_find
+0x53/0xb1
RSP: 0018:ffff8801b92c5768 EFLAGS: 00010282
RAX: ffff8805f74c8598 RBX: ffff8801b92c5878 RCX: 00000000b1528e8e
RDX: ffff8801b9272400 RSI: 0000000081b01fa0 RDI: 0000000000000246
RBP: ffff8801b92c57a8 R08: 0000000098466e0d R09: 0000000044ad0faf
R10: 00000000f1118f58 R11: ffff8801b92c589c R12: ffffffff82967ef0
R13: ffff8801b92c5878 R14: ffffffffffffffb8 R15: 000000007d1b90b3
FS: 00007f7b1d26b780(0000) GS:ffff88002fc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8805f74c8598 CR3: 00000001baf3c000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-kvm (pid: 3226, threadinfo ffff8801b92c4000, task
ffff8801b9272400)
Stack:
ffff880214fc04e0 00000003e8dc8598 ffffffffffffffb8 ffffffff82967ef0
<0> ffff880214fc04e0 ffff8801b92c5878 ffffffffffffffb8 ffffffff82967ef0
<0> ffff8801b92c57f8 ffffffff813ca759 ffff880214fc04e0 ffff880214fc04e0
Call Trace:
[<ffffffff813ca759>] nf_conntrack_find_get+0x33/0xb7
[<ffffffff813cb810>] nf_conntrack_in+0x209/0x7d0
[<ffffffff8140f2b3>] ipv4_conntrack_local+0x40/0x49
[<ffffffff813c85c9>] nf_iterate+0x4d/0xab
[<ffffffff813da338>] ? dst_output+0x0/0x12
[<ffffffff813c869f>] nf_hook_slow+0x78/0xe0
[<ffffffff813da338>] ? dst_output+0x0/0x12
[<ffffffff81069602>] ? autoremove_wake_function+0x24/0x39
[<ffffffff813db42a>] __ip_local_out+0x8b/0x8d
[<ffffffff813db442>] ip_local_out+0x16/0x27
[<ffffffff813dba70>] ip_queue_xmit+0x30e/0x36e
[<ffffffff813ed9c8>] tcp_transmit_skb+0x707/0x745
[<ffffffff813f003a>] tcp_write_xmit+0x7cb/0x8ba
[<ffffffff813f018e>] __tcp_push_pending_frames+0x2f/0x5d
[<ffffffff813e2dab>] tcp_push+0x88/0x8a
[<ffffffff813e50cc>] tcp_sendmsg+0x760/0x85b
[<ffffffff8139b42c>] __sock_sendmsg+0x5e/0x69
[<ffffffff8139b742>] sock_sendmsg+0xa8/0xc1
[<ffffffff81111999>] ? fget_light+0x57/0xf2
[<ffffffff81111940>] ? rcu_read_unlock+0x21/0x23
[<ffffffff81111999>] ? fget_light+0x57/0xf2
[<ffffffff81140cca>] ? eventfd_write+0x94/0x186
[<ffffffff8139b7d2>] ? sockfd_lookup_light+0x20/0x58
[<ffffffff8139d497>] sys_sendto+0x110/0x152
[<ffffffff81110670>] ? fsnotify_modify+0x6c/0x74
[<ffffffff81118ddf>] ? path_put+0x22/0x26
[<ffffffff810a1671>] ? audit_syscall_entry+0x11e/0x14a
[<ffffffff8143cf2a>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b
Code: 48 89 df e8 c9 f4 ff ff 41 89 c7 45 89 ff e8 fb 93 c8 ff 4a 8d 04
fd 00 00 00 00 48 89 45 c8 48 8b 45 c8 49 03 84 24 90 06 00 00 <4c> 8b
28 eb 14 65 83 40 04 01 e8 d0 94 c8 ff eb 3b 65 83 00 01
RIP [<ffffffff813ca6c8>] __nf_conntrack_find+0x53/0xb1
RSP <ffff8801b92c5768>
CR2: ffff8805f74c8598
---[ end trace d624218af5f5f65a ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 3226, comm: qemu-kvm Tainted: G D 2.6.33-rc5 #1
Call Trace:
[<ffffffff8143a899>] panic+0x7a/0x142
[<ffffffff8143ed83>] oops_end+0xb7/0xc7
[<ffffffff8102f719>] no_context+0x1fc/0x20b
[<ffffffff8102f8ba>] __bad_area_nosemaphore+0x192/0x1b5
[<ffffffff8107a4db>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff8102f8f0>] bad_area_nosemaphore+0x13/0x15
[<ffffffff81440875>] do_page_fault+0x159/0x2aa
[<ffffffff8143e0f5>] page_fault+0x25/0x30
[<ffffffff813ca6c8>] ? __nf_conntrack_find+0x53/0xb1
[<ffffffff813ca6b0>] ? __nf_conntrack_find+0x3b/0xb1
[<ffffffff813ca759>] nf_conntrack_find_get+0x33/0xb7
[<ffffffff813cb810>] nf_conntrack_in+0x209/0x7d0
[<ffffffff8140f2b3>] ipv4_conntrack_local+0x40/0x49
[<ffffffff813c85c9>] nf_iterate+0x4d/0xab
[<ffffffff813da338>] ? dst_output+0x0/0x12
[<ffffffff813c869f>] nf_hook_slow+0x78/0xe0
[<ffffffff813da338>] ? dst_output+0x0/0x12
[<ffffffff81069602>] ? autoremove_wake_function+0x24/0x39
[<ffffffff813db42a>] __ip_local_out+0x8b/0x8d
[<ffffffff813db442>] ip_local_out+0x16/0x27
[<ffffffff813dba70>] ip_queue_xmit+0x30e/0x36e
[<ffffffff813ed9c8>] tcp_transmit_skb+0x707/0x745
[<ffffffff813f003a>] tcp_write_xmit+0x7cb/0x8ba
[<ffffffff813f018e>] __tcp_push_pending_frames+0x2f/0x5d
[<ffffffff813e2dab>] tcp_push+0x88/0x8a
[<ffffffff813e50cc>] tcp_sendmsg+0x760/0x85b
[<ffffffff8139b42c>] __sock_sendmsg+0x5e/0x69
[<ffffffff8139b742>] sock_sendmsg+0xa8/0xc1
[<ffffffff81111999>] ? fget_light+0x57/0xf2
[<ffffffff81111940>] ? rcu_read_unlock+0x21/0x23
[<ffffffff81111999>] ? fget_light+0x57/0xf2
[<ffffffff81140cca>] ? eventfd_write+0x94/0x186
[<ffffffff8139b7d2>] ? sockfd_lookup_light+0x20/0x58
[<ffffffff8139d497>] sys_sendto+0x110/0x152
[<ffffffff81110670>] ? fsnotify_modify+0x6c/0x74
[<ffffffff81118ddf>] ? path_put+0x22/0x26
[<ffffffff810a1671>] ? audit_syscall_entry+0x11e/0x14a
[<ffffffff8143cf2a>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81009bf2>] system_call_fastpath+0x16/0x1b
[drm:drm_fb_helper_panic] *ERROR* panic occurred, switching back to text
console
Rebooting in 30 sec

Jon.

2010-01-29 08:42:55

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

Hi,

So I did some poking (I'm still getting to grips with netfilter
internals) and looked over the connection tracking handling. The
oops reports I have been getting generally lie in __nf_conntrack_find,
specifically within an hlist iterator that looks up the entry for
the current connection in a per-net-namespace hashtable (under RCU; the
read side has been locked already by the time we get in here). Here's the piece:

hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
        if (nf_ct_tuple_equal(tuple, &h->tuple)) {
                NF_CT_STAT_INC(net, found);
                local_bh_enable();
                return h;
        }
        NF_CT_STAT_INC(net, searched);
}

I'm instrumenting the kernel at the moment and setting up more of a
debugging environment to poke at what goes wrong here. Perhaps there's
some broken RCU assumption - I just spent the last few hours reading
over the netfilter source and Paul's RCU docs again to brush up.

Perhaps you netdev folks can let me know if there's a handy netfilter
debugging guide somewhere.

Jon.

2010-01-29 09:10:39

by Eric Dumazet

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Friday 29 January 2010 at 03:42 -0500, Jon Masters wrote:
> Hi,
>
> So I did some poking (still trying to figure out netfilter a little
> internally) and looked over the handling of connection tracking. The
> oops reports I have been getting generally lie in __nf_conntrack_find,
> specifically within a hlist iterator that looks up the information for
> the current connection in a per-net namespace hashtable (under RCU, it's
> been locked already by the time we get in here). Here's the piece:
>
> hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
>         if (nf_ct_tuple_equal(tuple, &h->tuple)) {
>                 NF_CT_STAT_INC(net, found);
>                 local_bh_enable();
>                 return h;
>         }
>         NF_CT_STAT_INC(net, searched);
> }
>
> Instrumenting the kernel at the moment and then setting up more of a
> debugging environment to poke at what goes wrong here. Perhaps there's
> some broken RCU assumption - I just spent the last few hours reading
> over netfilter source and Paul's RCU docs again to brush up.
>
> Perhaps you netdev folks can let me know if there's a handy netfilter
> debugging guide somewhere.
>
> Jon.
>
>

Jon, do you have multiple network namespaces active on your machine when
the crash occurs?


2010-01-29 10:57:47

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Fri, 2010-01-29 at 10:10 +0100, Eric Dumazet wrote:

> Jon, do you have multiple network namespaces active on your machine when
> the crash occurs?

I don't believe so. I just built in Jason's latest kgdb/kdb patches and
am going to give them a whirl to see if I can get anything more out of
this panic than I currently have.

Jon.

2010-01-29 21:51:45

by Jon Masters

Subject: Re: PROBLEM: reproducible crash KVM+nf_conntrack all recent 2.6 kernels

On Fri, 2010-01-29 at 05:57 -0500, Jon Masters wrote:
> On Fri, 2010-01-29 at 10:10 +0100, Eric Dumazet wrote:
>
> > Jon, do you have multiple network namespaces active on your machine when
> > the crash occurs?
>
> I don't believe so. I just built in Jason's latest kgdb/kdb patches and
> am going to give them a whirl to see if I can get anything more out of
> this panic than I currently have.

So, the latest crash was in here:

/* delete all expectations for this conntrack */
void nf_ct_remove_expectations(struct nf_conn *ct)
{
        struct nf_conn_help *help = nfct_help(ct);
        struct nf_conntrack_expect *exp;
        struct hlist_node *n, *next;

        /* Optimization: most connection never expect any others. */
        if (!help)
                return;

        hlist_for_each_entry_safe(exp, n, next, &help->expectations, lnode) {
                if (del_timer(&exp->timeout)) {
                        nf_ct_unlink_expect(exp);
                        nf_ct_expect_put(exp);
                }
        }
}
EXPORT_SYMBOL_GPL(nf_ct_remove_expectations);

Specifically, it died in that hlist_for_each_entry_safe iteration, even
though the expectations list is already empty (head pointer NULL):

at net/netfilter/nf_conntrack_expect.c:174
174             hlist_for_each_entry_safe(exp, n, next, &help->expectations, lnode) {

(gdb) bt
#0 nf_ct_remove_expectations (ct=<value optimized out>)
at net/netfilter/nf_conntrack_expect.c:174
#1 0xffffffff813e4db9 in destroy_conntrack (nfct=0xffffffff81b04e60)
at net/netfilter/nf_conntrack_core.c:202
#2 0xffffffff813e2af8 in nf_conntrack_destroy (nfct=<value optimized
out>)
at net/netfilter/core.c:243
#3 0xffffffff813bc209 in nf_conntrack_put (skb=0xffff88018fb97f00)
at include/linux/skbuff.h:1924
#4 skb_release_head_state (skb=0xffff88018fb97f00) at
net/core/skbuff.c:402
#5 0xffffffff813bbf6b in skb_release_all (skb=0xffff88018fb97f00)
at net/core/skbuff.c:420
#6 __kfree_skb (skb=0xffff88018fb97f00) at net/core/skbuff.c:435
#7 0xffffffff813bc070 in kfree_skb (skb=0xffff88018fb97f00)
at net/core/skbuff.c:456
#8 0xffffffffa03fa0a1 in ?? ()
#9 0x0000000000000050 in ?? ()
#10 0xffff8801de810780 in ?? ()
#11 0x0000000000000000 in ?? ()

(gdb) frame 1
#1 0xffffffff813e4db9 in destroy_conntrack (nfct=0xffffffff81b04e60)
at net/netfilter/nf_conntrack_core.c:202
202 nf_ct_remove_expectations(ct);
(gdb) print ct
$5 = (struct nf_conn *) 0xffffffff81b04e60
(gdb) print ct->
ct_general ext mark proto status tuplehash
ct_net lock master secmark timeout

(gdb) print ct->ext
$10 = (struct nf_ct_ext *) 0xffff8801de3369c0

(gdb) print (struct nf_conn_help *)(ct->ext +
ct->ext->offset[NF_CT_EXT_HELPER])
$22 = (struct nf_conn_help *) 0xffff8801de337a40

(gdb) print $22->helper
$23 = (struct nf_conntrack_helper *) 0xffff8801de337a00
(gdb) print $22->help
$24 = {ct_ftp_info = {seq_aft_nl = {{0, 0}, {269970752, 4294936578}},
seq_aft_nl_num = {323805184, 32555}}, ct_pptp_info = {
sstate = PPTP_SESSION_NONE, cstate = PPTP_CALL_NONE, pac_call_id =
27968,
pns_call_id = 4119, keymap = {0x7f2b134ce000, 0xb65e2965}},
ct_h323_info = {sig_port = {0, 0}, rtp_port = {{0, 0}, {27968, 4119},
{
34818, 65535}, {57344, 4940}}, {timeout = 32555, tpkt_len =
{32555,
0}}}, ct_sane_info = {state = SANE_STATE_NORMAL}, ct_sip_info =
{
register_cseq = 0, invite_cseq = 0}}
(gdb) print $22->expectations
$25 = {first = 0x0}
(gdb) print $22->expecting
$26 = "\000\000"

(gdb) print $22->helper->hnode
$28 = {next = 0xffff8801de3379c0, pprev = 0x0}

(gdb) print $22->helper->name
$31 = 0xffff880210176d40 "\300\235\363\020\002\210\377\377\020\027\b\022
\002\210\377\377p\236\363\020\002\210\377\377T", <incomplete sequence
\337>
(gdb) print $22->helper->me
$32 = (struct module *) 0x7f2b134cf000

(gdb) print $22->helper->tuple
$36 = {src = {u3 = {all = {0, 0, 0, 0}, ip = 0, ip6 = {0, 0, 0, 0}, in =
{
s_addr = 0}, in6 = {in6_u = {u6_addr8 = '\000' <repeats 15
times>,
u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, u6_addr32 = {0, 0, 0,
0}}}},
u = {all = 0, tcp = {port = 0}, udp = {port = 0}, icmp = {id = 0},
dccp = {
port = 0}, sctp = {port = 0}, gre = {key = 0}}, l3num = 0}, dst
= {
u3 = {all = {0, 3727915520, 4294936577, 0}, ip = 0, ip6 = {0,
3727915520,
4294936577, 0}, in = {s_addr = 0}, in6 = {in6_u = {
u6_addr8 = "\000\000\000\000\000z3\336\001\210\377\377\000\000
\000",
u6_addr16 = {0, 0, 31232, 56883, 34817, 65535, 0, 0},
u6_addr32 = {
0, 3727915520, 4294936577, 0}}}}, u = {all = 0, tcp = {port
= 0},
udp = {port = 0}, icmp = {type = 0 '\000', code = 0 '\000'}, dccp
= {
port = 0}, sctp = {port = 0}, gre = {key = 0}}, protonum = 0
'\000',
dir = 0 '\000'}}

Jon.