2010-01-09 22:22:30

by Michael Breuer

Subject: 2.6.33RC3 Sky2 oops - Driver tries to sync DMA memory it has not allocated

Hi,

Attempting to move back to mainline after my recent 2.6.32 issues...
Config is make oldconfig from working 2.6.32 config. Patch for
af_packet.c (for skb issue found in 2.6.32) included. Attaching .config
and NMI backtraces.

System becomes unusable after bringing up the network:

Jan 9 16:36:50 mail kernel: ------------[ cut here ]------------
Jan 9 16:36:50 mail kernel: WARNING: at lib/dma-debug.c:902
check_sync+0xbd/0x426()
Jan 9 16:36:50 mail kernel: Hardware name: System Product Name
Jan 9 16:36:50 mail kernel: sky2 0000:04:00.0: DMA-API: device driver
tries to sync DMA memory it has not allocated [device
address=0x0000000311686822] [size=60 bytes]
Jan 9 16:36:50 mail kernel: Modules linked in: bridge stp appletalk
psnap llc nfsd lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp
sunrpc acpi_cpufreq sit tunnel4 ipt_LOG ipt_MASQUERADE iptable_nat
nf_nat iptable_mangle iptable_raw nf_conntrack_netbios_ns
nf_conntrack_ftp nf_conntrack_ipv6 xt_multiport ip6table_filter xt_DSCP
xt_dscp xt_MARK ip6table_mangle ip6_tables ipv6 dm_multipath kvm_intel
kvm snd_hda_codec_analog snd_ens1371 gameport snd_rawmidi snd_ac97_codec
snd_hda_intel ac97_bus snd_hda_codec snd_hwdep snd_seq gspca_spca505
snd_seq_device gspca_main snd_pcm videodev snd_timer snd v4l1_compat
v4l2_compat_ioctl32 firewire_ohci soundcore snd_page_alloc iTCO_wdt
i2c_i801 iTCO_vendor_support firewire_core crc_itu_t sky2 pcspkr wmi
asus_atk0110 hwmon fbcon tileblit font bitblit softcursor raid456
async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx
raid1 ata_generic pata_acpi pata_marvell nouveau ttm drm_kms_helper drm
agpgart fb i2c_algo_bit cfbcopyarea i2c_core cfbimgblt cfbfil
Jan 9 16:36:50 mail kernel: lrect [last unloaded: scsi_wait_scan]
Jan 9 16:36:50 mail kernel: Pid: 5271, comm: libvirtd Not tainted
2.6.33-rc3WITHMMAPNODMAR-00147-g3c8ad49-dirty #1
Jan 9 16:36:50 mail kernel: Call Trace:
Jan 9 16:36:50 mail kernel: <IRQ> [<ffffffff81049fe5>]
warn_slowpath_common+0x7c/0x94
Jan 9 16:36:50 mail kernel: [<ffffffff8104a054>]
warn_slowpath_fmt+0x41/0x43
Jan 9 16:36:50 mail kernel: [<ffffffff81261c0a>] check_sync+0xbd/0x426
Jan 9 16:36:50 mail kernel: [<ffffffff813b2aff>] ?
__netdev_alloc_skb+0x34/0x50
Jan 9 16:36:50 mail kernel: [<ffffffff812622c6>]
debug_dma_sync_single_for_cpu+0x42/0x44
Jan 9 16:36:50 mail kernel: [<ffffffff8125f6c7>] ?
swiotlb_sync_single+0x2a/0xb6
Jan 9 16:36:50 mail kernel: [<ffffffff8125f823>] ?
swiotlb_sync_single_for_cpu+0xc/0xe
Jan 9 16:36:50 mail kernel: [<ffffffffa018efcb>] sky2_poll+0x4d5/0xaf0
[sky2]
Jan 9 16:36:50 mail kernel: [<ffffffff8106a1a3>] ?
sched_clock_cpu+0x44/0xce
Jan 9 16:36:50 mail kernel: [<ffffffff81070573>] ?
clockevents_program_event+0x7a/0x83
Jan 9 16:36:50 mail kernel: [<ffffffff813b9766>] net_rx_action+0xb5/0x1f0
Jan 9 16:36:50 mail kernel: [<ffffffff8105059c>] __do_softirq+0xf8/0x1cd
Jan 9 16:36:50 mail kernel: [<ffffffff8109389a>] ?
handle_IRQ_event+0x119/0x12b
Jan 9 16:36:50 mail kernel: [<ffffffff8100ab1c>] call_softirq+0x1c/0x30
Jan 9 16:36:50 mail kernel: [<ffffffff8100c2b3>] do_softirq+0x4b/0xa3
Jan 9 16:36:50 mail kernel: [<ffffffff81050188>] irq_exit+0x4a/0x8c
Jan 9 16:36:50 mail kernel: [<ffffffff8145a83c>] do_IRQ+0xac/0xc3
Jan 9 16:36:50 mail kernel: [<ffffffff81455a93>] ret_from_intr+0x0/0x16
Jan 9 16:36:50 mail kernel: <EOI> [<ffffffff8104474c>] ?
set_cpus_allowed_ptr+0x22/0x14b
Jan 9 16:36:50 mail kernel: [<ffffffff81087aff>]
cpuset_attach_task+0x27/0x9c
Jan 9 16:36:50 mail kernel: [<ffffffff81087bfe>] cpuset_attach+0x8a/0x133
Jan 9 16:36:50 mail kernel: [<ffffffff81042cba>] ?
sched_move_task+0x104/0x110
Jan 9 16:36:50 mail kernel: [<ffffffff81085b4f>]
cgroup_attach_task+0x4d5/0x533
Jan 9 16:36:50 mail kernel: [<ffffffff81085e05>] cgroup_clone+0x258/0x2ac
Jan 9 16:36:50 mail kernel: [<ffffffff81088a74>] ns_cgroup_clone+0x58/0x75
Jan 9 16:36:50 mail kernel: [<ffffffff81048ec1>] copy_process+0xcef/0x13af
Jan 9 16:36:50 mail kernel: [<ffffffff810d9044>] ?
handle_mm_fault+0x355/0x7ff
Jan 9 16:36:50 mail kernel: [<ffffffff8108f769>] ?
audit_filter_rules+0x19a/0x7c5
Jan 9 16:36:50 mail kernel: [<ffffffff810496ec>] do_fork+0x16b/0x309
Jan 9 16:36:50 mail kernel: [<ffffffff81251d12>] ? __up_read+0x82/0x8a
Jan 9 16:36:50 mail kernel: [<ffffffff81010f22>] sys_clone+0x28/0x2a
Jan 9 16:36:50 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
Jan 9 16:36:50 mail kernel: [<ffffffff81009bf2>] ?
system_call_fastpath+0x16/0x1b
Jan 9 16:36:50 mail kernel: ---[ end trace cd5e0588bad4ec83 ]---

Then, after a few more normal boot messages (samba starting up, etc.),
I just see RCU stalls with NMI backtraces for each CPU. I've attached
the first one - the RCU stall oops repeated until I forced a reboot.




Attachments:
messages (33.34 kB)
config (83.91 kB)

2010-01-10 20:10:56

by Michael Breuer

Subject: Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was Sky2 oops - Driver tries to sync DMA memory it has not allocated)

On 1/9/2010 5:21 PM, Michael Breuer wrote:
> Hi,
>
> Attempting to move back to mainline after my recent 2.6.32 issues...
> Config is make oldconfig from working 2.6.32 config. Patch for
> af_packet.c (for skb issue found in 2.6.32) included. Attaching
> .config and NMI backtraces.
>
> System becomes unusable after bringing up the network:
>
> Jan 9 16:36:50 mail kernel: ------------[ cut here ]------------
> Jan 9 16:36:50 mail kernel: WARNING: at lib/dma-debug.c:902
> check_sync+0xbd/0x426()
> Jan 9 16:36:50 mail kernel: Hardware name: System Product Name
> Jan 9 16:36:50 mail kernel: sky2 0000:04:00.0: DMA-API: device driver
> tries to sync DMA memory it has not allocated [device
> address=0x0000000311686822] [size=60 bytes]
> Jan 9 16:36:50 mail kernel: Modules linked in: bridge stp appletalk
> psnap llc nfsd lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp
> sunrpc acpi_cpufreq sit tunnel4 ipt_LOG ipt_MASQUERADE iptable_nat
> nf_nat iptable_mangle iptable_raw nf_conntrack_netbios_ns
> nf_conntrack_ftp nf_conntrack_ipv6 xt_multiport ip6table_filter
> xt_DSCP xt_dscp xt_MARK ip6table_mangle ip6_tables ipv6 dm_multipath
> kvm_intel kvm snd_hda_codec_analog snd_ens1371 gameport snd_rawmidi
> snd_ac97_codec snd_hda_intel ac97_bus snd_hda_codec snd_hwdep snd_seq
> gspca_spca505 snd_seq_device gspca_main snd_pcm videodev snd_timer snd
> v4l1_compat v4l2_compat_ioctl32 firewire_ohci soundcore snd_page_alloc
> iTCO_wdt i2c_i801 iTCO_vendor_support firewire_core crc_itu_t sky2
> pcspkr wmi asus_atk0110 hwmon fbcon tileblit font bitblit softcursor
> raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy
> async_tx raid1 ata_generic pata_acpi pata_marvell nouveau ttm
> drm_kms_helper drm agpgart fb i2c_algo_bit cfbcopyarea i2c_core
> cfbimgblt cfbfil
> Jan 9 16:36:50 mail kernel: lrect [last unloaded: scsi_wait_scan]
> Jan 9 16:36:50 mail kernel: Pid: 5271, comm: libvirtd Not tainted
> 2.6.33-rc3WITHMMAPNODMAR-00147-g3c8ad49-dirty #1
> Jan 9 16:36:50 mail kernel: Call Trace:
> Jan 9 16:36:50 mail kernel: <IRQ> [<ffffffff81049fe5>]
> warn_slowpath_common+0x7c/0x94
> Jan 9 16:36:50 mail kernel: [<ffffffff8104a054>]
> warn_slowpath_fmt+0x41/0x43
> Jan 9 16:36:50 mail kernel: [<ffffffff81261c0a>] check_sync+0xbd/0x426
> Jan 9 16:36:50 mail kernel: [<ffffffff813b2aff>] ?
> __netdev_alloc_skb+0x34/0x50
> Jan 9 16:36:50 mail kernel: [<ffffffff812622c6>]
> debug_dma_sync_single_for_cpu+0x42/0x44
> Jan 9 16:36:50 mail kernel: [<ffffffff8125f6c7>] ?
> swiotlb_sync_single+0x2a/0xb6
> Jan 9 16:36:50 mail kernel: [<ffffffff8125f823>] ?
> swiotlb_sync_single_for_cpu+0xc/0xe
> Jan 9 16:36:50 mail kernel: [<ffffffffa018efcb>]
> sky2_poll+0x4d5/0xaf0 [sky2]
> Jan 9 16:36:50 mail kernel: [<ffffffff8106a1a3>] ?
> sched_clock_cpu+0x44/0xce
> Jan 9 16:36:50 mail kernel: [<ffffffff81070573>] ?
> clockevents_program_event+0x7a/0x83
> Jan 9 16:36:50 mail kernel: [<ffffffff813b9766>]
> net_rx_action+0xb5/0x1f0
> Jan 9 16:36:50 mail kernel: [<ffffffff8105059c>] __do_softirq+0xf8/0x1cd
> Jan 9 16:36:50 mail kernel: [<ffffffff8109389a>] ?
> handle_IRQ_event+0x119/0x12b
> Jan 9 16:36:50 mail kernel: [<ffffffff8100ab1c>] call_softirq+0x1c/0x30
> Jan 9 16:36:50 mail kernel: [<ffffffff8100c2b3>] do_softirq+0x4b/0xa3
> Jan 9 16:36:50 mail kernel: [<ffffffff81050188>] irq_exit+0x4a/0x8c
> Jan 9 16:36:50 mail kernel: [<ffffffff8145a83c>] do_IRQ+0xac/0xc3
> Jan 9 16:36:50 mail kernel: [<ffffffff81455a93>] ret_from_intr+0x0/0x16
> Jan 9 16:36:50 mail kernel: <EOI> [<ffffffff8104474c>] ?
> set_cpus_allowed_ptr+0x22/0x14b
> Jan 9 16:36:50 mail kernel: [<ffffffff81087aff>]
> cpuset_attach_task+0x27/0x9c
> Jan 9 16:36:50 mail kernel: [<ffffffff81087bfe>]
> cpuset_attach+0x8a/0x133
> Jan 9 16:36:50 mail kernel: [<ffffffff81042cba>] ?
> sched_move_task+0x104/0x110
> Jan 9 16:36:50 mail kernel: [<ffffffff81085b4f>]
> cgroup_attach_task+0x4d5/0x533
> Jan 9 16:36:50 mail kernel: [<ffffffff81085e05>]
> cgroup_clone+0x258/0x2ac
> Jan 9 16:36:50 mail kernel: [<ffffffff81088a74>]
> ns_cgroup_clone+0x58/0x75
> Jan 9 16:36:50 mail kernel: [<ffffffff81048ec1>]
> copy_process+0xcef/0x13af
> Jan 9 16:36:50 mail kernel: [<ffffffff810d9044>] ?
> handle_mm_fault+0x355/0x7ff
> Jan 9 16:36:50 mail kernel: [<ffffffff8108f769>] ?
> audit_filter_rules+0x19a/0x7c5
> Jan 9 16:36:50 mail kernel: [<ffffffff810496ec>] do_fork+0x16b/0x309
> Jan 9 16:36:50 mail kernel: [<ffffffff81251d12>] ? __up_read+0x82/0x8a
> Jan 9 16:36:50 mail kernel: [<ffffffff81010f22>] sys_clone+0x28/0x2a
> Jan 9 16:36:50 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
> Jan 9 16:36:50 mail kernel: [<ffffffff81009bf2>] ?
> system_call_fastpath+0x16/0x1b
> Jan 9 16:36:50 mail kernel: ---[ end trace cd5e0588bad4ec83 ]---
> Then... after a few more normal boot messages (samba starting up,
> etc.) I just see rcu stalls with NMI backtraces for each cpu. I've
> attached the first one - the rcu stall oops repeats until the reboot I
> forced.

Tracked this down to libvirtd. No idea why yet, but these oopses occur
when starting libvirtd. The version of libvirt is 0.7.0-15.fc12.x86_64.

Also, checking back on 2.6.32, I found that the sky2 oops listed above
also occurs (it seems to have started after an update to
libvirt-java-0.4.0-1.fc12.noarch two days ago). However, the subsequent
RCU stall doesn't happen on 2.6.32 - the system behaves normally (which
is why I missed the oops).

Now running OK on 2.6.33 without libvirtd.

2010-01-12 01:49:14

by Paul E. McKenney

Subject: Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was Sky2 oops - Driver tries to sync DMA memory it has not allocated)

On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>> Hi,
>>
>> Attempting to move back to mainline after my recent 2.6.32 issues...
>> Config is make oldconfig from working 2.6.32 config. Patch for af_packet.c
>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>> backtraces.
>>
>> System becomes unusable after bringing up the network:
>>
>> Jan 9 16:36:50 mail kernel: ------------[ cut here ]------------
>> Jan 9 16:36:50 mail kernel: WARNING: at lib/dma-debug.c:902
>> check_sync+0xbd/0x426()
>> Jan 9 16:36:50 mail kernel: Hardware name: System Product Name
>> Jan 9 16:36:50 mail kernel: sky2 0000:04:00.0: DMA-API: device driver
>> tries to sync DMA memory it has not allocated [device
>> address=0x0000000311686822] [size=60 bytes]
>> Jan 9 16:36:50 mail kernel: Modules linked in: bridge stp appletalk psnap
>> llc nfsd lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp sunrpc
>> acpi_cpufreq sit tunnel4 ipt_LOG ipt_MASQUERADE iptable_nat nf_nat
>> iptable_mangle iptable_raw nf_conntrack_netbios_ns nf_conntrack_ftp
>> nf_conntrack_ipv6 xt_multiport ip6table_filter xt_DSCP xt_dscp xt_MARK
>> ip6table_mangle ip6_tables ipv6 dm_multipath kvm_intel kvm
>> snd_hda_codec_analog snd_ens1371 gameport snd_rawmidi snd_ac97_codec
>> snd_hda_intel ac97_bus snd_hda_codec snd_hwdep snd_seq gspca_spca505
>> snd_seq_device gspca_main snd_pcm videodev snd_timer snd v4l1_compat
>> v4l2_compat_ioctl32 firewire_ohci soundcore snd_page_alloc iTCO_wdt
>> i2c_i801 iTCO_vendor_support firewire_core crc_itu_t sky2 pcspkr wmi
>> asus_atk0110 hwmon fbcon tileblit font bitblit softcursor raid456
>> async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx
>> raid1 ata_generic pata_acpi pata_marvell nouveau ttm drm_kms_helper drm
>> agpgart fb i2c_algo_bit cfbcopyarea i2c_core cfbimgblt cfbfil
>> Jan 9 16:36:50 mail kernel: lrect [last unloaded: scsi_wait_scan]
>> Jan 9 16:36:50 mail kernel: Pid: 5271, comm: libvirtd Not tainted
>> 2.6.33-rc3WITHMMAPNODMAR-00147-g3c8ad49-dirty #1
>> Jan 9 16:36:50 mail kernel: Call Trace:
>> Jan 9 16:36:50 mail kernel: <IRQ> [<ffffffff81049fe5>]
>> warn_slowpath_common+0x7c/0x94
>> Jan 9 16:36:50 mail kernel: [<ffffffff8104a054>]
>> warn_slowpath_fmt+0x41/0x43
>> Jan 9 16:36:50 mail kernel: [<ffffffff81261c0a>] check_sync+0xbd/0x426
>> Jan 9 16:36:50 mail kernel: [<ffffffff813b2aff>] ?
>> __netdev_alloc_skb+0x34/0x50
>> Jan 9 16:36:50 mail kernel: [<ffffffff812622c6>]
>> debug_dma_sync_single_for_cpu+0x42/0x44
>> Jan 9 16:36:50 mail kernel: [<ffffffff8125f6c7>] ?
>> swiotlb_sync_single+0x2a/0xb6
>> Jan 9 16:36:50 mail kernel: [<ffffffff8125f823>] ?
>> swiotlb_sync_single_for_cpu+0xc/0xe
>> Jan 9 16:36:50 mail kernel: [<ffffffffa018efcb>] sky2_poll+0x4d5/0xaf0
>> [sky2]
>> Jan 9 16:36:50 mail kernel: [<ffffffff8106a1a3>] ?
>> sched_clock_cpu+0x44/0xce
>> Jan 9 16:36:50 mail kernel: [<ffffffff81070573>] ?
>> clockevents_program_event+0x7a/0x83
>> Jan 9 16:36:50 mail kernel: [<ffffffff813b9766>] net_rx_action+0xb5/0x1f0
>> Jan 9 16:36:50 mail kernel: [<ffffffff8105059c>] __do_softirq+0xf8/0x1cd
>> Jan 9 16:36:50 mail kernel: [<ffffffff8109389a>] ?
>> handle_IRQ_event+0x119/0x12b
>> Jan 9 16:36:50 mail kernel: [<ffffffff8100ab1c>] call_softirq+0x1c/0x30
>> Jan 9 16:36:50 mail kernel: [<ffffffff8100c2b3>] do_softirq+0x4b/0xa3
>> Jan 9 16:36:50 mail kernel: [<ffffffff81050188>] irq_exit+0x4a/0x8c
>> Jan 9 16:36:50 mail kernel: [<ffffffff8145a83c>] do_IRQ+0xac/0xc3
>> Jan 9 16:36:50 mail kernel: [<ffffffff81455a93>] ret_from_intr+0x0/0x16
>> Jan 9 16:36:50 mail kernel: <EOI> [<ffffffff8104474c>] ?
>> set_cpus_allowed_ptr+0x22/0x14b
>> Jan 9 16:36:50 mail kernel: [<ffffffff81087aff>]
>> cpuset_attach_task+0x27/0x9c
>> Jan 9 16:36:50 mail kernel: [<ffffffff81087bfe>] cpuset_attach+0x8a/0x133
>> Jan 9 16:36:50 mail kernel: [<ffffffff81042cba>] ?
>> sched_move_task+0x104/0x110
>> Jan 9 16:36:50 mail kernel: [<ffffffff81085b4f>]
>> cgroup_attach_task+0x4d5/0x533
>> Jan 9 16:36:50 mail kernel: [<ffffffff81085e05>] cgroup_clone+0x258/0x2ac
>> Jan 9 16:36:50 mail kernel: [<ffffffff81088a74>]
>> ns_cgroup_clone+0x58/0x75
>> Jan 9 16:36:50 mail kernel: [<ffffffff81048ec1>]
>> copy_process+0xcef/0x13af
>> Jan 9 16:36:50 mail kernel: [<ffffffff810d9044>] ?
>> handle_mm_fault+0x355/0x7ff
>> Jan 9 16:36:50 mail kernel: [<ffffffff8108f769>] ?
>> audit_filter_rules+0x19a/0x7c5
>> Jan 9 16:36:50 mail kernel: [<ffffffff810496ec>] do_fork+0x16b/0x309
>> Jan 9 16:36:50 mail kernel: [<ffffffff81251d12>] ? __up_read+0x82/0x8a
>> Jan 9 16:36:50 mail kernel: [<ffffffff81010f22>] sys_clone+0x28/0x2a
>> Jan 9 16:36:50 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
>> Jan 9 16:36:50 mail kernel: [<ffffffff81009bf2>] ?
>> system_call_fastpath+0x16/0x1b
>> Jan 9 16:36:50 mail kernel: ---[ end trace cd5e0588bad4ec83 ]---
>> Then... after a few more normal boot messages (samba starting up, etc.) I
>> just see rcu stalls with NMI backtraces for each cpu. I've attached the
>> first one - the rcu stall oops repeats until the reboot I forced.
> Tracked this down to libvirtd. No idea why yet - but these oops occur when
> starting libvirtd. Version of libvirt is 0.7.0-15.fc12.x86_64.

RCU stall warnings are usually due to an infinite loop somewhere in the
kernel. If you are running !CONFIG_PREEMPT, then any infinite loop not
containing some call to schedule will get you a stall warning. If you
are running CONFIG_PREEMPT, then the infinite loop is in some section of
code with preemption disabled (or irqs disabled).

The stall-warning dump will normally finger one or more of the CPUs.
Since you are getting repeated warnings, look at the stacks and see
which of the most-recently-called functions stays the same in successive
stack traces. This information should help you finger the infinite (or
longer than average) loop.

> Also, checking back to 2.6.32 - found that the sky2 oops listed above also
> occurs (started it seems after an update to
> libvirt-java-0.4.0-1.fc12.noarch two days ago). However the subsequent rcu
> stall doesn't happen on 2.6.32 - system behaves normally (which is why I
> missed the oops).
> Now running OK on 2.6.33 w/o libvirtd.

Then if looking at the stack traces doesn't locate the offending loop,
bisection might help.

Thanx, Paul

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2010-01-13 18:44:29

by Michael Breuer

Subject: 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

[Originally posted as: "Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was
Sky2 oops - Driver tries to sync DMA memory it has not allocated)"]

On 1/11/2010 8:49 PM, Paul E. McKenney wrote:
> On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
>
>> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>>
>>> Hi,
>>>
>>> Attempting to move back to mainline after my recent 2.6.32 issues...
>>> Config is make oldconfig from working 2.6.32 config. Patch for af_packet.c
>>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>>> backtraces.
>>>
>>> System becomes unusable after bringing up the network:
>>>
>>> ...
> RCU stall warnings are usually due to an infinite loop somewhere in the
> kernel. If you are running !CONFIG_PREEMPT, then any infinite loop not
> containing some call to schedule will get you a stall warning. If you
> are running CONFIG_PREEMPT, then the infinite loop is in some section of
> code with preemption disabled (or irqs disabled).
>
> The stall-warning dump will normally finger one or more of the CPUs.
> Since you are getting repeated warnings, look at the stacks and see
> which of the most-recently-called functions stays the same in successive
> stack traces. This information should help you finger the infinite (or
> longer than average) loop.
> ...
>

I can now recreate this simply by "service start libvirtd" on an F12
box. My earlier report suggesting this had something to do with the
sky2 driver was incorrect. Interestingly, it's always CPU 1 whenever I
start libvirtd.

Attaching two of the traces (I've got about ten, but they're all pretty
much the same). Looks pretty consistent - libvirtd on CPU 1 is hung
forking. Not sure why yet - perhaps someone who knows this code better
than I do can jump in.

In summary, the hang appears to be that libvirtd forks, and two threads
with the same pid end up deadlocked on a spin_lock.
> Then if looking at the stack traces doesn't locate the offending loop,
> bisection might help.
>

It would; however, it's going to be really difficult, as I wasn't able
to get this far with rc1 and rc2 :(
> Thanx, Paul
>
>


Attachments:
stall1 (33.99 kB)
stall2 (34.79 kB)

2010-01-13 18:58:42

by Paul E. McKenney

Subject: Re: 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On Wed, Jan 13, 2010 at 01:43:45PM -0500, Michael Breuer wrote:
> [Originally posted as: "Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was Sky2
> oops - Driver tries to sync DMA memory it has not allocated)"]
>
> On 1/11/2010 8:49 PM, Paul E. McKenney wrote:
>> On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
>>
>>> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>>>
>>>> Hi,
>>>>
>>>> Attempting to move back to mainline after my recent 2.6.32 issues...
>>>> Config is make oldconfig from working 2.6.32 config. Patch for
>>>> af_packet.c
>>>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>>>> backtraces.
>>>>
>>>> System becomes unusable after bringing up the network:
>>>>
>>>> ...
>> RCU stall warnings are usually due to an infinite loop somewhere in the
>> kernel. If you are running !CONFIG_PREEMPT, then any infinite loop not
>> containing some call to schedule will get you a stall warning. If you
>> are running CONFIG_PREEMPT, then the infinite loop is in some section of
>> code with preemption disabled (or irqs disabled).
>>
>> The stall-warning dump will normally finger one or more of the CPUs.
>> Since you are getting repeated warnings, look at the stacks and see
>> which of the most-recently-called functions stays the same in successive
>> stack traces. This information should help you finger the infinite (or
>> longer than average) loop.
>> ...
>>
> I can now recreate this simply by "service start libvirtd" on an F12 box.
> My earlier report that suggested this had something to do with the sky2
> driver was incorrect. Interestingly, it's always CPU1 whenever I start
> libvirtd.
> Attaching two of the traces (I've got about ten, but they're all pretty
> much the same). Looks pretty consistent - libvirtd in CPU1 is hung forking.
> Not sure why yet - perhaps someone who knows this better than I can jump
> in.
> Summary of hang appears to be libvirtd forks - two threads show with same
> pid deadlocked on a spin_lock
>> Then if looking at the stack traces doesn't locate the offending loop,
>> bisection might help.
>>
> It would, however it's going to be really difficult as I wasn't able to get
> this far with rc1 & rc2 :(

I must defer to others on libvirtd -- perhaps fixing that problem will
also address the RCU CPU stall. Hey, I can dream! ;-)

Thanx, Paul


>

> Jan 13 12:59:25 mail kernel: INFO: RCU detected CPU 1 stall (t=10000 jiffies)
> Jan 13 12:59:25 mail kernel: sending NMI to all CPUs:
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 4
> Jan 13 12:59:25 mail kernel: CPU 4
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff8803325b5e38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff8803325b5fd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff8803325b5e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2a90f9 R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 0000003a542a5010 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325b4000, task ffff8803325b8000)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff88032eb285a8 ffff88032eb28000 ffff8803325b5e58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff8803325b5e78 ffffffff812ab088 000000004b4e09fd ffff88032eb285a8
> Jan 13 12:59:25 mail kernel: <0> ffff8803325b5ed8 ffffffff812ab348 0000000000000000 0000000000000322
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 1
> Jan 13 12:59:25 mail kernel: CPU 1
> Jan 13 12:59:25 mail kernel: Pid: 7809, comm: libvirtd Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff812560d8>] [<ffffffff812560d8>] __const_udelay+0x37/0x44
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff880028223dc8 EFLAGS: 00000002
> Jan 13 12:59:25 mail kernel: RAX: 0000000001062560 RBX: 0000000000000001 RCX: ffff880028220000
> Jan 13 12:59:25 mail kernel: RDX: 0000000027d22b08 RSI: 0000000000000002 RDI: 0000000000418958
> Jan 13 12:59:25 mail kernel: RBP: ffff880028223dc8 R08: ffff880028223ce8 R09: 0000000000000000
> Jan 13 12:59:25 mail kernel: R10: 0000000000000004 R11: ffff880330e1de00 R12: ffff8800282303f0
> Jan 13 12:59:25 mail kernel: R13: ffffffff81a50600 R14: ffff880028223f48 R15: ffff880028223f48
> Jan 13 12:59:25 mail kernel: FS: 00007fbdf8da27e0(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 00000000004479d0 CR3: 00000002d525f000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process libvirtd (pid: 7809, threadinfo ffff8802a726c000, task ffff8802ba04c680)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff880028223de8 ffffffff810206eb ffffffff91a24769 ffffffff81a50600
> Jan 13 12:59:25 mail kernel: <0> ffff880028223e38 ffffffff8109814b 00000000000103c0 ffff8802a726da38
> Jan 13 12:59:25 mail kernel: <0> ffff880028223e58 0000000000000001 0000000000000001 0000000000000000
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <IRQ>
> Jan 13 12:59:25 mail kernel: [<ffffffff810206eb>] arch_trigger_all_cpu_backtrace+0x57/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff8109814b>] __rcu_pending+0x6e/0x2a5
> Jan 13 12:59:25 mail kernel: [<ffffffff810983b7>] rcu_check_callbacks+0x35/0x10a
> Jan 13 12:59:25 mail kernel: [<ffffffff81057b37>] update_process_times+0x41/0x5c
> Jan 13 12:59:25 mail kernel: [<ffffffff81071cba>] tick_sched_timer+0x77/0xa0
> Jan 13 12:59:25 mail kernel: [<ffffffff81067f08>] __run_hrtimer+0xb8/0x117
> Jan 13 12:59:25 mail kernel: [<ffffffff81071c43>] ? tick_sched_timer+0x0/0xa0
> Jan 13 12:59:25 mail kernel: [<ffffffff81068187>] hrtimer_interrupt+0xc7/0x1b5
> Jan 13 12:59:25 mail kernel: [<ffffffff814575c6>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> Jan 13 12:59:25 mail kernel: [<ffffffff8145cb14>] smp_apic_timer_interrupt+0x81/0x94
> Jan 13 12:59:25 mail kernel: [<ffffffff8100a5d3>] apic_timer_interrupt+0x13/0x20
> Jan 13 12:59:25 mail kernel: <EOI>
> Jan 13 12:59:25 mail kernel: [<ffffffff810447be>] ? set_cpus_allowed_ptr+0x22/0x14b
> Jan 13 12:59:25 mail kernel: [<ffffffff810f2a7b>] ? spin_lock+0xe/0x10
> Jan 13 12:59:25 mail kernel: [<ffffffff81087d79>] cpuset_attach_task+0x27/0x9b
> Jan 13 12:59:25 mail kernel: [<ffffffff81087e77>] cpuset_attach+0x8a/0x133
> Jan 13 12:59:25 mail kernel: [<ffffffff81042d2c>] ? sched_move_task+0x104/0x110
> Jan 13 12:59:25 mail kernel: [<ffffffff81085dbd>] cgroup_attach_task+0x4e1/0x53f
> Jan 13 12:59:25 mail kernel: [<ffffffff81084f48>] ? cgroup_populate_dir+0x77/0xff
> Jan 13 12:59:25 mail kernel: [<ffffffff81086073>] cgroup_clone+0x258/0x2ac
> Jan 13 12:59:25 mail kernel: [<ffffffff81088d04>] ns_cgroup_clone+0x58/0x75
> Jan 13 12:59:25 mail kernel: [<ffffffff81048f3d>] copy_process+0xcef/0x13af
> Jan 13 12:59:25 mail kernel: [<ffffffff810d963c>] ? handle_mm_fault+0x355/0x7ff
> Jan 13 12:59:25 mail kernel: [<ffffffff81049768>] do_fork+0x16b/0x309
> Jan 13 12:59:25 mail kernel: [<ffffffff81252ab2>] ? __up_read+0x8e/0x97
> Jan 13 12:59:25 mail kernel: [<ffffffff81068c92>] ? up_read+0xe/0x10
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a779>] ? do_page_fault+0x280/0x2cc
> Jan 13 12:59:25 mail kernel: [<ffffffff81010f2e>] sys_clone+0x28/0x2a
> Jan 13 12:59:25 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
> Jan 13 12:59:25 mail kernel: [<ffffffff81009bf2>] ? system_call_fastpath+0x16/0x1b
> Jan 13 12:59:25 mail kernel: Code: 80 56 01 00 48 8d 04 bd 00 00 00 00 65 8b 0c 25 38 e3 00 00 48 63 c9 48 8b 0c cd 20 fa ad 81 48 69 94 0a 98 00 00 00 fa 00 00 00 <f7> e2 48 8d 7a 01 e8 ad ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 7809, comm: libvirtd Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff812560d8>] ? __const_udelay+0x37/0x44
> Jan 13 12:59:25 mail kernel: <<EOE>> <IRQ> [<ffffffff810206eb>] arch_trigger_all_cpu_backtrace+0x57/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff8109814b>] __rcu_pending+0x6e/0x2a5
> Jan 13 12:59:25 mail kernel: [<ffffffff810983b7>] rcu_check_callbacks+0x35/0x10a
> Jan 13 12:59:25 mail kernel: [<ffffffff81057b37>] update_process_times+0x41/0x5c
> Jan 13 12:59:25 mail kernel: [<ffffffff81071cba>] tick_sched_timer+0x77/0xa0
> Jan 13 12:59:25 mail kernel: [<ffffffff81067f08>] __run_hrtimer+0xb8/0x117
> Jan 13 12:59:25 mail kernel: [<ffffffff81071c43>] ? tick_sched_timer+0x0/0xa0
> Jan 13 12:59:25 mail kernel: [<ffffffff81068187>] hrtimer_interrupt+0xc7/0x1b5
> Jan 13 12:59:25 mail kernel: [<ffffffff814575c6>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> Jan 13 12:59:25 mail kernel: [<ffffffff8145cb14>] smp_apic_timer_interrupt+0x81/0x94
> Jan 13 12:59:25 mail kernel: [<ffffffff8100a5d3>] apic_timer_interrupt+0x13/0x20
> Jan 13 12:59:25 mail kernel: <EOI> [<ffffffff810447be>] ? set_cpus_allowed_ptr+0x22/0x14b
> Jan 13 12:59:25 mail kernel: [<ffffffff810f2a7b>] ? spin_lock+0xe/0x10
> Jan 13 12:59:25 mail kernel: [<ffffffff81087d79>] cpuset_attach_task+0x27/0x9b
> Jan 13 12:59:25 mail kernel: [<ffffffff81087e77>] cpuset_attach+0x8a/0x133
> Jan 13 12:59:25 mail kernel: [<ffffffff81042d2c>] ? sched_move_task+0x104/0x110
> Jan 13 12:59:25 mail kernel: [<ffffffff81085dbd>] cgroup_attach_task+0x4e1/0x53f
> Jan 13 12:59:25 mail kernel: [<ffffffff81084f48>] ? cgroup_populate_dir+0x77/0xff
> Jan 13 12:59:25 mail kernel: [<ffffffff81086073>] cgroup_clone+0x258/0x2ac
> Jan 13 12:59:25 mail kernel: [<ffffffff81088d04>] ns_cgroup_clone+0x58/0x75
> Jan 13 12:59:25 mail kernel: [<ffffffff81048f3d>] copy_process+0xcef/0x13af
> Jan 13 12:59:25 mail kernel: [<ffffffff810d963c>] ? handle_mm_fault+0x355/0x7ff
> Jan 13 12:59:25 mail kernel: [<ffffffff81049768>] do_fork+0x16b/0x309
> Jan 13 12:59:25 mail kernel: [<ffffffff81252ab2>] ? __up_read+0x8e/0x97
> Jan 13 12:59:25 mail kernel: [<ffffffff81068c92>] ? up_read+0xe/0x10
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a779>] ? do_page_fault+0x280/0x2cc
> Jan 13 12:59:25 mail kernel: [<ffffffff81010f2e>] sys_clone+0x28/0x2a
> Jan 13 12:59:25 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
> Jan 13 12:59:25 mail kernel: [<ffffffff81009bf2>] ? system_call_fastpath+0x16/0x1b
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 6
> Jan 13 12:59:25 mail kernel: CPU 6
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff8803325e5e38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff8803325e5fd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff8803325e5e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2a9d4e R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 00007fe9aa1b6000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325e4000, task ffff8803325dc680)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff88032eb2a5a8 ffff88032eb2a000 ffff8803325e5e58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff8803325e5e78 ffffffff812ab088 000000004b4e09fd ffff88032eb2a5a8
> Jan 13 12:59:25 mail kernel: <0> ffff8803325e5ed8 ffffffff812ab348 0000000000000000 0000000000000314
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 2
> Jan 13 12:59:25 mail kernel: CPU 2
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff88033256de38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff88033256dfd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff88033256de48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 0000000000000400 R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2ac0cf R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 00007f15e00ed000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff88033256c000, task ffff880332562f00)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff880331e5e5a8 ffff880331e5e000 ffff88033256de58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff88033256de78 ffffffff812ab088 000000004b4e09fd ffff880331e5e5a8
> Jan 13 12:59:25 mail kernel: <0> ffff88033256ded8 ffffffff812ab348 0000000000000000 0000000000000318
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 3
> Jan 13 12:59:25 mail kernel: CPU 3
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff88033259fe38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff88033259ffd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff88033259fe48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2ab4ba R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff880028260000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 00007fc73f1c7000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff88033259e000, task ffff880332579780)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff880331e5f5a8 ffff880331e5f000 ffff88033259fe58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff88033259fe78 ffffffff812ab088 000000004b4e09fd ffff880331e5f5a8
> Jan 13 12:59:25 mail kernel: <0> ffff88033259fed8 ffffffff812ab348 0000000000000000 000000000000031d
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 7
> Jan 13 12:59:25 mail kernel: CPU 7
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff880331c01e38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff880331c01fd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff880331c01e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2a841f R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 0000003a54576190 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff880331c00000, task ffff8803325f2f00)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff88032eb2b5a8 ffff88032eb2b000 ffff880331c01e58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff880331c01e78 ffffffff812ab088 000000004b4e09fd ffff88032eb2b5a8
> Jan 13 12:59:25 mail kernel: <0> ffff880331c01ed8 ffffffff812ab348 0000000000000000 0000000000000307
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 0
> Jan 13 12:59:25 mail kernel: CPU 0
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffffffff81a01e28 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffffffff81a01fd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffffffff81a01e38 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2ad623 R14: ffffffffffffffff R15: 00000000000936b0
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 000000000243f018 CR3: 0000000001a34000 CR4: 00000000000006f0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a3c020)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff880331e5c5a8 ffff880331e5c000 ffffffff81a01e48 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffffffff81a01e68 ffffffff812ab088 000000004b4e09fd ffff880331e5c5a8
> Jan 13 12:59:25 mail kernel: <0> ffffffff81a01ec8 ffffffff812ab348 0000000000000000 0000000000000317
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff8144485a>] rest_init+0x7e/0x80
> Jan 13 12:59:25 mail kernel: [<ffffffff81b00d83>] start_kernel+0x427/0x432
> Jan 13 12:59:25 mail kernel: [<ffffffff81b002bc>] x86_64_start_reservations+0xa7/0xab
> Jan 13 12:59:25 mail kernel: [<ffffffff81b003b8>] x86_64_start_kernel+0xf8/0x107
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff8144485a>] rest_init+0x7e/0x80
> Jan 13 12:59:25 mail kernel: [<ffffffff81b00d83>] start_kernel+0x427/0x432
> Jan 13 12:59:25 mail kernel: [<ffffffff81b002bc>] x86_64_start_reservations+0xa7/0xab
> Jan 13 12:59:25 mail kernel: [<ffffffff81b003b8>] x86_64_start_kernel+0xf8/0x107
> Jan 13 12:59:25 mail kernel: NMI backtrace for cpu 5
> Jan 13 12:59:25 mail kernel: CPU 5
> Jan 13 12:59:25 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:25 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: RSP: 0018:ffff8803325d1e38 EFLAGS: 00000046
> Jan 13 12:59:25 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:25 mail kernel: RDX: 0000000000000000 RSI: ffff8803325d1fd8 RDI: 0000000000000003
> Jan 13 12:59:25 mail kernel: RBP: ffff8803325d1e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:25 mail kernel: R10: 0000000000000000 R11: 00000000000005fe R12: 0000000000000001
> Jan 13 12:59:25 mail kernel: R13: 118884aafc2ad4cd R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:25 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000
> Jan 13 12:59:25 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:25 mail kernel: CR2: 00007fe7a12a8000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:25 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:25 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:25 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325d0000, task ffff8803325bde00)
> Jan 13 12:59:25 mail kernel: Stack:
> Jan 13 12:59:25 mail kernel: ffff88032eb295a8 ffff88032eb29000 ffff8803325d1e58 ffffffff8101e23e
> Jan 13 12:59:25 mail kernel: <0> ffff8803325d1e78 ffffffff812ab088 000000004b4e09fd ffff88032eb295a8
> Jan 13 12:59:25 mail kernel: <0> ffff8803325d1ed8 ffffffff812ab348 0000000000000000 0000000000000313
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:25 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:25 mail kernel: Call Trace:
> Jan 13 12:59:25 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:25 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:25 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:25 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:25 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:25 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:25 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:25 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:25 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:25 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:25 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242

> Jan 13 12:59:55 mail kernel: INFO: RCU detected CPU 1 stall (t=40000 jiffies)
> Jan 13 12:59:55 mail kernel: sending NMI to all CPUs:
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 4
> Jan 13 12:59:55 mail kernel: CPU 4
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff8803325b5e38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff8803325b5fd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff8803325b5e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e3f85 R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00007fee47bf8d7f CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325b4000, task ffff8803325b8000)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff88032eb285a8 ffff88032eb28000 ffff8803325b5e58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff8803325b5e78 ffffffff812ab088 000000004b4e0a1b ffff88032eb285a8
> Jan 13 12:59:55 mail kernel: <0> ffff8803325b5ed8 ffffffff812ab348 0000000000000000 0000000000000318
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 6
> Jan 13 12:59:55 mail kernel: CPU 6
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff8803325e5e38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff8803325e5fd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff8803325e5e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e4c6d R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00007fa18eb54000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325e4000, task ffff8803325dc680)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff88032eb2a5a8 ffff88032eb2a000 ffff8803325e5e58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff8803325e5e78 ffffffff812ab088 000000004b4e0a1b ffff88032eb2a5a8
> Jan 13 12:59:55 mail kernel: <0> ffff8803325e5ed8 ffffffff812ab348 0000000000000000 0000000000000319
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 2
> Jan 13 12:59:55 mail kernel: CPU 2
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff88033256de38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff88033256dfd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff88033256de48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000003fe R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e6f4a R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00007fe9aa1b6000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff88033256c000, task ffff880332562f00)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff880331e5e5a8 ffff880331e5e000 ffff88033256de58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff88033256de78 ffffffff812ab088 000000004b4e0a1b ffff880331e5e5a8
> Jan 13 12:59:55 mail kernel: <0> ffff88033256ded8 ffffffff812ab348 0000000000000000 000000000000031b
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 1
> Jan 13 12:59:55 mail kernel: CPU 1
> Jan 13 12:59:55 mail kernel: Pid: 7809, comm: libvirtd Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff8101096b>] [<ffffffff8101096b>] native_read_tsc+0x6/0x16
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff880028223d78 EFLAGS: 00000006
> Jan 13 12:59:55 mail kernel: RAX: 000000009330ba6c RBX: ffffffff9330ba30 RCX: 000000009330ba30
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000247 RSI: 0000000000000002 RDI: 000000000028c6e9
> Jan 13 12:59:55 mail kernel: RBP: ffff880028223d78 R08: ffff880028223ce8 R09: 0000000000000000
> Jan 13 12:59:55 mail kernel: R10: 0000000000000004 R11: ffff880330e1de00 R12: 000000000028c6e9
> Jan 13 12:59:55 mail kernel: R13: 0000000000000001 R14: ffff880028223f48 R15: ffff880028223f48
> Jan 13 12:59:55 mail kernel: FS: 00007fbdf8da27e0(0000) GS:ffff880028220000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00000000004479d0 CR3: 00000002d525f000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process libvirtd (pid: 7809, threadinfo ffff8802a726c000, task ffff8802ba04c680)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff880028223da8 ffffffff81256147 0000000000000001 ffff8800282303f0
> Jan 13 12:59:55 mail kernel: <0> ffffffff81a50600 ffff880028223f48 ffff880028223db8 ffffffff8125609f
> Jan 13 12:59:55 mail kernel: <0> ffff880028223dc8 ffffffff812560e3 ffff880028223de8 ffffffff810206eb
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <IRQ>
> Jan 13 12:59:55 mail kernel: [<ffffffff81256147>] delay_tsc+0x37/0x80
> Jan 13 12:59:55 mail kernel: [<ffffffff8125609f>] __delay+0xf/0x11
> Jan 13 12:59:55 mail kernel: [<ffffffff812560e3>] __const_udelay+0x42/0x44
> Jan 13 12:59:55 mail kernel: [<ffffffff810206eb>] arch_trigger_all_cpu_backtrace+0x57/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff8109814b>] __rcu_pending+0x6e/0x2a5
> Jan 13 12:59:55 mail kernel: [<ffffffff810983b7>] rcu_check_callbacks+0x35/0x10a
> Jan 13 12:59:55 mail kernel: [<ffffffff81057b37>] update_process_times+0x41/0x5c
> Jan 13 12:59:55 mail kernel: [<ffffffff81071cba>] tick_sched_timer+0x77/0xa0
> Jan 13 12:59:55 mail kernel: [<ffffffff81067f08>] __run_hrtimer+0xb8/0x117
> Jan 13 12:59:55 mail kernel: [<ffffffff81071c43>] ? tick_sched_timer+0x0/0xa0
> Jan 13 12:59:55 mail kernel: [<ffffffff81068187>] hrtimer_interrupt+0xc7/0x1b5
> Jan 13 12:59:55 mail kernel: [<ffffffff814575c6>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> Jan 13 12:59:55 mail kernel: [<ffffffff8145cb14>] smp_apic_timer_interrupt+0x81/0x94
> Jan 13 12:59:55 mail kernel: [<ffffffff8100a5d3>] apic_timer_interrupt+0x13/0x20
> Jan 13 12:59:55 mail kernel: <EOI>
> Jan 13 12:59:55 mail kernel: [<ffffffff810447c1>] ? set_cpus_allowed_ptr+0x25/0x14b
> Jan 13 12:59:55 mail kernel: [<ffffffff810f2a7b>] ? spin_lock+0xe/0x10
> Jan 13 12:59:55 mail kernel: [<ffffffff81087d79>] cpuset_attach_task+0x27/0x9b
> Jan 13 12:59:55 mail kernel: [<ffffffff81087e77>] cpuset_attach+0x8a/0x133
> Jan 13 12:59:55 mail kernel: [<ffffffff81042d2c>] ? sched_move_task+0x104/0x110
> Jan 13 12:59:55 mail kernel: [<ffffffff81085dbd>] cgroup_attach_task+0x4e1/0x53f
> Jan 13 12:59:55 mail kernel: [<ffffffff81084f48>] ? cgroup_populate_dir+0x77/0xff
> Jan 13 12:59:55 mail kernel: [<ffffffff81086073>] cgroup_clone+0x258/0x2ac
> Jan 13 12:59:55 mail kernel: [<ffffffff81088d04>] ns_cgroup_clone+0x58/0x75
> Jan 13 12:59:55 mail kernel: [<ffffffff81048f3d>] copy_process+0xcef/0x13af
> Jan 13 12:59:55 mail kernel: [<ffffffff810d963c>] ? handle_mm_fault+0x355/0x7ff
> Jan 13 12:59:55 mail kernel: [<ffffffff81049768>] do_fork+0x16b/0x309
> Jan 13 12:59:55 mail kernel: [<ffffffff81252ab2>] ? __up_read+0x8e/0x97
> Jan 13 12:59:55 mail kernel: [<ffffffff81068c92>] ? up_read+0xe/0x10
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a779>] ? do_page_fault+0x280/0x2cc
> Jan 13 12:59:55 mail kernel: [<ffffffff81010f2e>] sys_clone+0x28/0x2a
> Jan 13 12:59:55 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
> Jan 13 12:59:55 mail kernel: [<ffffffff81009bf2>] ? system_call_fastpath+0x16/0x1b
> Jan 13 12:59:55 mail kernel: Code: 57 24 00 c9 c3 90 90 90 55 40 88 f8 48 89 e5 e6 70 e4 71 c9 c3 55 40 88 f0 48 89 e5 e6 70 40 88 f8 e6 71 c9 c3 55 48 89 e5 0f 31 <89> c1 c9 48 89 d0 89 c9 48 c1 e0 20 48 09 c8 c3 55 48 89 e5 41
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 7809, comm: libvirtd Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff8101096b>] ? native_read_tsc+0x6/0x16
> Jan 13 12:59:55 mail kernel: <<EOE>> <IRQ> [<ffffffff81256147>] delay_tsc+0x37/0x80
> Jan 13 12:59:55 mail kernel: [<ffffffff8125609f>] __delay+0xf/0x11
> Jan 13 12:59:55 mail kernel: [<ffffffff812560e3>] __const_udelay+0x42/0x44
> Jan 13 12:59:55 mail kernel: [<ffffffff810206eb>] arch_trigger_all_cpu_backtrace+0x57/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff8109814b>] __rcu_pending+0x6e/0x2a5
> Jan 13 12:59:55 mail kernel: [<ffffffff810983b7>] rcu_check_callbacks+0x35/0x10a
> Jan 13 12:59:55 mail kernel: [<ffffffff81057b37>] update_process_times+0x41/0x5c
> Jan 13 12:59:55 mail kernel: [<ffffffff81071cba>] tick_sched_timer+0x77/0xa0
> Jan 13 12:59:55 mail kernel: [<ffffffff81067f08>] __run_hrtimer+0xb8/0x117
> Jan 13 12:59:55 mail kernel: [<ffffffff81071c43>] ? tick_sched_timer+0x0/0xa0
> Jan 13 12:59:55 mail kernel: [<ffffffff81068187>] hrtimer_interrupt+0xc7/0x1b5
> Jan 13 12:59:55 mail kernel: [<ffffffff814575c6>] ? trace_hardirqs_off_thunk+0x3a/0x6c
> Jan 13 12:59:55 mail kernel: [<ffffffff8145cb14>] smp_apic_timer_interrupt+0x81/0x94
> Jan 13 12:59:55 mail kernel: [<ffffffff8100a5d3>] apic_timer_interrupt+0x13/0x20
> Jan 13 12:59:55 mail kernel: <EOI> [<ffffffff810447c1>] ? set_cpus_allowed_ptr+0x25/0x14b
> Jan 13 12:59:55 mail kernel: [<ffffffff810f2a7b>] ? spin_lock+0xe/0x10
> Jan 13 12:59:55 mail kernel: [<ffffffff81087d79>] cpuset_attach_task+0x27/0x9b
> Jan 13 12:59:55 mail kernel: [<ffffffff81087e77>] cpuset_attach+0x8a/0x133
> Jan 13 12:59:55 mail kernel: [<ffffffff81042d2c>] ? sched_move_task+0x104/0x110
> Jan 13 12:59:55 mail kernel: [<ffffffff81085dbd>] cgroup_attach_task+0x4e1/0x53f
> Jan 13 12:59:55 mail kernel: [<ffffffff81084f48>] ? cgroup_populate_dir+0x77/0xff
> Jan 13 12:59:55 mail kernel: [<ffffffff81086073>] cgroup_clone+0x258/0x2ac
> Jan 13 12:59:55 mail kernel: [<ffffffff81088d04>] ns_cgroup_clone+0x58/0x75
> Jan 13 12:59:55 mail kernel: [<ffffffff81048f3d>] copy_process+0xcef/0x13af
> Jan 13 12:59:55 mail kernel: [<ffffffff810d963c>] ? handle_mm_fault+0x355/0x7ff
> Jan 13 12:59:55 mail kernel: [<ffffffff81049768>] do_fork+0x16b/0x309
> Jan 13 12:59:55 mail kernel: [<ffffffff81252ab2>] ? __up_read+0x8e/0x97
> Jan 13 12:59:55 mail kernel: [<ffffffff81068c92>] ? up_read+0xe/0x10
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a779>] ? do_page_fault+0x280/0x2cc
> Jan 13 12:59:55 mail kernel: [<ffffffff81010f2e>] sys_clone+0x28/0x2a
> Jan 13 12:59:55 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
> Jan 13 12:59:55 mail kernel: [<ffffffff81009bf2>] ? system_call_fastpath+0x16/0x1b
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 7
> Jan 13 12:59:55 mail kernel: CPU 7
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff880331c01e38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff880331c01fd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff880331c01e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e3244 R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 000000000199f058 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff880331c00000, task ffff8803325f2f00)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff88032eb2b5a8 ffff88032eb2b000 ffff880331c01e58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff880331c01e78 ffffffff812ab088 000000004b4e0a1b ffff88032eb2b5a8
> Jan 13 12:59:55 mail kernel: <0> ffff880331c01ed8 ffffffff812ab348 0000000000000000 0000000000000309
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 3
> Jan 13 12:59:55 mail kernel: CPU 3
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff88033259fe38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff88033259ffd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff88033259fe48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e6383 R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff880028260000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00007fc73f1c7000 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff88033259e000, task ffff880332579780)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff880331e5f5a8 ffff880331e5f000 ffff88033259fe58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff88033259fe78 ffffffff812ab088 000000004b4e0a1b ffff880331e5f5a8
> Jan 13 12:59:55 mail kernel: <0> ffff88033259fed8 ffffffff812ab348 0000000000000000 0000000000000323
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 0
> Jan 13 12:59:55 mail kernel: CPU 0
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffffffff81a01e28 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffffffff81a01fd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffffffff81a01e38 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000002ff R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e83f4 R14: ffffffffffffffff R15: 00000000000936b0
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 00000000018331b0 CR3: 0000000001a34000 CR4: 00000000000006f0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a3c020)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff880331e5c5a8 ffff880331e5c000 ffffffff81a01e48 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffffffff81a01e68 ffffffff812ab088 000000004b4e0a1b ffff880331e5c5a8
> Jan 13 12:59:55 mail kernel: <0> ffffffff81a01ec8 ffffffff812ab348 0000000000000000 0000000000000314
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff8144485a>] rest_init+0x7e/0x80
> Jan 13 12:59:55 mail kernel: [<ffffffff81b00d83>] start_kernel+0x427/0x432
> Jan 13 12:59:55 mail kernel: [<ffffffff81b002bc>] x86_64_start_reservations+0xa7/0xab
> Jan 13 12:59:55 mail kernel: [<ffffffff81b003b8>] x86_64_start_kernel+0xf8/0x107
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff8144485a>] rest_init+0x7e/0x80
> Jan 13 12:59:55 mail kernel: [<ffffffff81b00d83>] start_kernel+0x427/0x432
> Jan 13 12:59:55 mail kernel: [<ffffffff81b002bc>] x86_64_start_reservations+0xa7/0xab
> Jan 13 12:59:55 mail kernel: [<ffffffff81b003b8>] x86_64_start_kernel+0xf8/0x107
> Jan 13 12:59:55 mail kernel: NMI backtrace for cpu 5
> Jan 13 12:59:55 mail kernel: CPU 5
> Jan 13 12:59:55 mail kernel: Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1 P6T DELUXE V2/System Product Name
> Jan 13 12:59:55 mail kernel: RIP: 0010:[<ffffffff81011655>] [<ffffffff81011655>] mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: RSP: 0018:ffff8803325d1e38 EFLAGS: 00000046
> Jan 13 12:59:55 mail kernel: RAX: 0000000000000020 RBX: 0000000000000020 RCX: 0000000000000001
> Jan 13 12:59:55 mail kernel: RDX: 0000000000000000 RSI: ffff8803325d1fd8 RDI: 0000000000000003
> Jan 13 12:59:55 mail kernel: RBP: ffff8803325d1e48 R08: 0000000000000000 R09: 0000000000000002
> Jan 13 12:59:55 mail kernel: R10: 0000000000000000 R11: 00000000000005fe R12: 0000000000000001
> Jan 13 12:59:55 mail kernel: R13: 118884b1f84e8572 R14: 0000000000000000 R15: 0000000000000000
> Jan 13 12:59:55 mail kernel: FS: 0000000000000000(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000
> Jan 13 12:59:55 mail kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Jan 13 12:59:55 mail kernel: CR2: 0000003a542a5010 CR3: 0000000001a34000 CR4: 00000000000006e0
> Jan 13 12:59:55 mail kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Jan 13 12:59:55 mail kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Jan 13 12:59:55 mail kernel: Process swapper (pid: 0, threadinfo ffff8803325d0000, task ffff8803325bde00)
> Jan 13 12:59:55 mail kernel: Stack:
> Jan 13 12:59:55 mail kernel: ffff88032eb295a8 ffff88032eb29000 ffff8803325d1e58 ffffffff8101e23e
> Jan 13 12:59:55 mail kernel: <0> ffff8803325d1e78 ffffffff812ab088 000000004b4e0a1b ffff88032eb295a8
> Jan 13 12:59:55 mail kernel: <0> ffff8803325d1ed8 ffffffff812ab348 0000000000000000 0000000000000313
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 12:59:55 mail kernel: Code: 25 08 cc 00 00 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 09 48 89 d8 4c 89 e1 0f 01 c9 <5b> 41 5c c9 c3 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 65 48
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <#DB[1]> <<EOE>> Pid: 0, comm: swapper Tainted: G W 2.6.33-rc4WITHMMAPNODMARSKY2SKBMAYPULL-00001-g033d717-dirty #1
> Jan 13 12:59:55 mail kernel: Call Trace:
> Jan 13 12:59:55 mail kernel: <NMI> [<ffffffff81008889>] ? show_regs+0x2b/0x30
> Jan 13 12:59:55 mail kernel: [<ffffffff81459268>] nmi_watchdog_tick+0xc2/0x1a0
> Jan 13 12:59:55 mail kernel: [<ffffffff81458879>] do_nmi+0xc4/0x28e
> Jan 13 12:59:55 mail kernel: [<ffffffff81458270>] nmi+0x20/0x39
> Jan 13 12:59:55 mail kernel: [<ffffffff81011655>] ? mwait_idle_with_hints+0x82/0x87
> Jan 13 12:59:55 mail kernel: <<EOE>> [<ffffffff8101e23e>] acpi_processor_ffh_cstate_enter+0x32/0x34
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab088>] acpi_idle_do_entry+0x26/0x46
> Jan 13 12:59:55 mail kernel: [<ffffffff812ab348>] acpi_idle_enter_bm+0x1d0/0x28a
> Jan 13 12:59:55 mail kernel: [<ffffffff8145a7d9>] ? notifier_call_chain+0x14/0x63
> Jan 13 12:59:55 mail kernel: [<ffffffff81390b5c>] cpuidle_idle_call+0x9e/0xfa
> Jan 13 12:59:55 mail kernel: [<ffffffff81008c05>] cpu_idle+0xb4/0xf6
> Jan 13 12:59:55 mail kernel: [<ffffffff81450ff5>] start_secondary+0x201/0x242
> Jan 13 13:00:17 mail nmbd[5145]: [2010/01/13 13:00:17, 0] nmbd/nmbd_become_lmb.c:395(become_local_master_stage2)
> Jan 13 13:00:17 mail nmbd[5145]: *****
> Jan 13 13:00:17 mail nmbd[5145]:
> Jan 13 13:00:17 mail nmbd[5145]: Samba name server MAJJAS is now a local master browser for workgroup MAJJAS on subnet 192.168.122.1
> Jan 13 13:00:17 mail nmbd[5145]:
> Jan 13 13:00:17 mail nmbd[5145]: *****

2010-01-24 02:49:37

by Michael Breuer

Subject: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On 01/13/2010 01:43 PM, Michael Breuer wrote:
> [Originally posted as: "Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was
> Sky2 oops - Driver tries to sync DMA memory it has not allocated)"]
>
> On 1/11/2010 8:49 PM, Paul E. McKenney wrote:
>> On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
>>> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>>>> Hi,
>>>>
>>>> Attempting to move back to mainline after my recent 2.6.32 issues...
>>>> Config is make oldconfig from working 2.6.32 config. Patch for
>>>> af_packet.c
>>>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>>>> backtraces.
>>>>
>>>> System becomes unusable after bringing up the network:
>>>>
>>>> ...
>> RCU stall warnings are usually due to an infinite loop somewhere in the
>> kernel. If you are running !CONFIG_PREEMPT, then any infinite loop not
>> containing some call to schedule will get you a stall warning. If you
>> are running CONFIG_PREEMPT, then the infinite loop is in some section of
>> code with preemption disabled (or irqs disabled).
>>
>> The stall-warning dump will normally finger one or more of the CPUs.
>> Since you are getting repeated warnings, look at the stacks and see
>> which of the most-recently-called functions stays the same in successive
>> stack traces. This information should help you finger the infinite (or
>> longer than average) loop.
>> ...
> I can now recreate this simply by running "service libvirtd start" on an
> F12 box. My earlier report suggesting this had something to do with the
> sky2 driver was incorrect. Interestingly, it's always CPU1 that hangs
> whenever I start libvirtd.
> Attaching two of the traces (I've got about ten, but they're all
> pretty much the same). Looks pretty consistent - libvirtd on CPU1 is
> hung forking. Not sure why yet - perhaps someone who knows this code
> better than I do can jump in.
> Summary of the hang: libvirtd forks, and two threads with the same pid
> end up deadlocked on a spin_lock.
>> Then if looking at the stack traces doesn't locate the offending loop,
>> bisection might help.
> It would, however it's going to be really difficult as I wasn't able
> to get this far with rc1 & rc2 :(
>> Thanx, Paul
>
I was finally able to bisect this to commit:
3802290628348674985d14914f9bfee7b9084548 (see below)

Libvirtd always triggers the crash; other things that fork and use mmap
sometimes do (vsftpd, for example).

Author: Peter Zijlstra <[email protected]> 2009-12-16 12:04:37
Committer: Ingo Molnar <[email protected]> 2009-12-16 13:01:56
Parent: e2912009fb7b715728311b0d8fe327a1432b3f79 (sched: Ensure
set_task_cpu() is never called on blocked tasks)
Branches: remotes/origin/master
Follows: v2.6.32
Precedes: v2.6.33-rc2

sched: Fix sched_exec() balancing

Since we access ->cpus_allowed without holding rq->lock we need
a retry loop to validate the result, this comes for near free
when we merge sched_migrate_task() into sched_exec() since that
already does the needed check.

Signed-off-by: Peter Zijlstra <[email protected]>
Cc: Mike Galbraith <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

-------------------------------- kernel/sched.c --------------------------------
index 33d7965..63e55ac 100644
@@ -2322,7 +2322,7 @@ void task_oncpu_function_call(struct task_struct *p,
*
* - fork, @p is stable because it isn't on the tasklist yet
*
- * - exec, @p is unstable XXX
+ * - exec, @p is unstable, retry loop
*
* - wake-up, we serialize ->cpus_allowed against TASK_WAKING so
* we should be good.
@@ -3132,21 +3132,36 @@ static void double_rq_unlock(struct rq *rq1, struct rq *rq2)
}

/*
- * If dest_cpu is allowed for this process, migrate the task to it.
- * This is accomplished by forcing the cpu_allowed mask to only
- * allow dest_cpu, which will force the cpu onto dest_cpu. Then
- * the cpu_allowed mask is restored.
+ * sched_exec - execve() is a valuable balancing opportunity, because at
+ * this point the task has the smallest effective memory and cache
+ * footprint.
*/
-static void sched_migrate_task(struct task_struct *p, int dest_cpu)
+void sched_exec(void)
{
+ struct task_struct *p = current;
struct migration_req req;
+ int dest_cpu, this_cpu;
unsigned long flags;
struct rq *rq;

+again:
+ this_cpu = get_cpu();
+ dest_cpu = select_task_rq(p, SD_BALANCE_EXEC, 0);
+ if (dest_cpu == this_cpu) {
+ put_cpu();
+ return;
+ }
+
rq = task_rq_lock(p, &flags);
+ put_cpu();
+
+ /*
+ * select_task_rq() can race against ->cpus_allowed
+ */
if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed)
- || unlikely(!cpu_active(dest_cpu)))
- goto out;
+ || unlikely(!cpu_active(dest_cpu))) {
+ task_rq_unlock(rq, &flags);
+ goto again;
+ }

/* force the process onto the specified CPU */
if (migrate_task(p, dest_cpu, &req)) {
@@ -3161,24 +3176,10 @@ static void sched_migrate_task(struct task_struct *p, int dest_cpu)

return;
}
-out:
task_rq_unlock(rq, &flags);
}

/*
- * sched_exec - execve() is a valuable balancing opportunity, because at
- * this point the task has the smallest effective memory and cache
- * footprint.
- */
-void sched_exec(void)
-{
- int new_cpu, this_cpu = get_cpu();
- new_cpu = select_task_rq(current, SD_BALANCE_EXEC, 0);
- put_cpu();
- if (new_cpu != this_cpu)
- sched_migrate_task(current, new_cpu);
-}
-
-/*
* pull_task - move a task from a remote runqueue to the local runqueue.
* Both runqueues must be locked.
*/

2010-01-24 06:00:18

by Mike Galbraith

[permalink] [raw]
Subject: Re: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On Sat, 2010-01-23 at 21:49 -0500, Michael Breuer wrote:
> On 01/13/2010 01:43 PM, Michael Breuer wrote:

> > [earlier report snipped]
>
> I was finally able to bisect this to commit:
> 3802290628348674985d14914f9bfee7b9084548 (see below)

I suspect something went wrong during bisection, however...

Jan 13 12:59:25 mail kernel: [<ffffffff810447be>] ? set_cpus_allowed_ptr+0x22/0x14b
Jan 13 12:59:25 mail kernel: [<ffffffff810f2a7b>] ? spin_lock+0xe/0x10
Jan 13 12:59:25 mail kernel: [<ffffffff81087d79>] cpuset_attach_task+0x27/0x9b
Jan 13 12:59:25 mail kernel: [<ffffffff81087e77>] cpuset_attach+0x8a/0x133
Jan 13 12:59:25 mail kernel: [<ffffffff81042d2c>] ? sched_move_task+0x104/0x110
Jan 13 12:59:25 mail kernel: [<ffffffff81085dbd>] cgroup_attach_task+0x4e1/0x53f
Jan 13 12:59:25 mail kernel: [<ffffffff81084f48>] ? cgroup_populate_dir+0x77/0xff
Jan 13 12:59:25 mail kernel: [<ffffffff81086073>] cgroup_clone+0x258/0x2ac
Jan 13 12:59:25 mail kernel: [<ffffffff81088d04>] ns_cgroup_clone+0x58/0x75
Jan 13 12:59:25 mail kernel: [<ffffffff81048f3d>] copy_process+0xcef/0x13af
Jan 13 12:59:25 mail kernel: [<ffffffff810d963c>] ? handle_mm_fault+0x355/0x7ff
Jan 13 12:59:25 mail kernel: [<ffffffff81049768>] do_fork+0x16b/0x309
Jan 13 12:59:25 mail kernel: [<ffffffff81252ab2>] ? __up_read+0x8e/0x97
Jan 13 12:59:25 mail kernel: [<ffffffff81068c92>] ? up_read+0xe/0x10
Jan 13 12:59:25 mail kernel: [<ffffffff8145a779>] ? do_page_fault+0x280/0x2cc
Jan 13 12:59:25 mail kernel: [<ffffffff81010f2e>] sys_clone+0x28/0x2a
Jan 13 12:59:25 mail kernel: [<ffffffff81009f33>] stub_clone+0x13/0x20
Jan 13 12:59:25 mail kernel: [<ffffffff81009bf2>] ? system_call_fastpath+0x16/0x1b

...that looks like a bug which has already been fixed in tip, but not
yet propagated. Your trace looks like the relax-forever scenario.

commit fabf318e5e4bda0aca2b0d617b191884fda62703
Author: Peter Zijlstra <[email protected]>
Date: Thu Jan 21 21:04:57 2010 +0100

sched: Fix fork vs hotplug vs cpuset namespaces

There are a number of issues:

1) TASK_WAKING vs cgroup_clone (cpusets)

copy_process():

  sched_fork()
    child->state = TASK_WAKING; /* waiting for wake_up_new_task() */
  if (current->nsproxy != p->nsproxy)
    ns_cgroup_clone()
      cgroup_clone()
        mutex_lock(inode->i_mutex)
        mutex_lock(cgroup_mutex)
        cgroup_attach_task()
          ss->can_attach()
          ss->attach() [ -> cpuset_attach() ]
            cpuset_attach_task()
              set_cpus_allowed_ptr();
                while (child->state == TASK_WAKING)
                  cpu_relax();
will deadlock the system.


2) cgroup_clone (cpusets) vs copy_process

So even if the above would work we still have:

copy_process():

  if (current->nsproxy != p->nsproxy)
    ns_cgroup_clone()
      cgroup_clone()
        mutex_lock(inode->i_mutex)
        mutex_lock(cgroup_mutex)
        cgroup_attach_task()
          ss->can_attach()
          ss->attach() [ -> cpuset_attach() ]
            cpuset_attach_task()
              set_cpus_allowed_ptr();
  ...

p->cpus_allowed = current->cpus_allowed

over-writing the modified cpus_allowed.


3) fork() vs hotplug

if we unplug the child's cpu after the sanity check when the child
gets attached to the task_list but before wake_up_new_task() shit
will meet with fan.

Solve all these issues by moving fork cpu selection into
wake_up_new_task().

Reported-by: Serge E. Hallyn <[email protected]>
Tested-by: Serge E. Hallyn <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
LKML-Reference: <1264106190.4283.1314.camel@laptop>
Signed-off-by: Thomas Gleixner <[email protected]>

diff --git a/kernel/fork.c b/kernel/fork.c
index 5b2959b..f88bd98 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1241,21 +1241,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
/* Need tasklist lock for parent etc handling! */
write_lock_irq(&tasklist_lock);

- /*
- * The task hasn't been attached yet, so its cpus_allowed mask will
- * not be changed, nor will its assigned CPU.
- *
- * The cpus_allowed mask of the parent may have changed after it was
- * copied first time - so re-copy it here, then check the child's CPU
- * to ensure it is on a valid CPU (and if not, just force it back to
- * parent's CPU). This avoids alot of nasty races.
- */
- p->cpus_allowed = current->cpus_allowed;
- p->rt.nr_cpus_allowed = current->rt.nr_cpus_allowed;
- if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed) ||
- !cpu_online(task_cpu(p))))
- set_task_cpu(p, smp_processor_id());
-
/* CLONE_PARENT re-uses the old parent */
if (clone_flags & (CLONE_PARENT|CLONE_THREAD)) {
p->real_parent = current->real_parent;
diff --git a/kernel/sched.c b/kernel/sched.c
index 4508fe7..3a8fb30 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2320,14 +2320,12 @@ static int select_fallback_rq(int cpu, struct task_struct *p)
}

/*
- * Called from:
+ * Gets called from 3 sites (exec, fork, wakeup), since it is called without
+ * holding rq->lock we need to ensure ->cpus_allowed is stable, this is done
+ * by:
*
- * - fork, @p is stable because it isn't on the tasklist yet
- *
- * - exec, @p is unstable, retry loop
- *
- * - wake-up, we serialize ->cpus_allowed against TASK_WAKING so
- * we should be good.
+ * exec: is unstable, retry loop
+ * fork & wake-up: serialize ->cpus_allowed against TASK_WAKING
*/
static inline
int select_task_rq(struct task_struct *p, int sd_flags, int wake_flags)
@@ -2620,9 +2618,6 @@ void sched_fork(struct task_struct *p, int clone_flags)
if (p->sched_class->task_fork)
p->sched_class->task_fork(p);

-#ifdef CONFIG_SMP
- cpu = select_task_rq(p, SD_BALANCE_FORK, 0);
-#endif
set_task_cpu(p, cpu);

#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
@@ -2652,6 +2647,21 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
{
unsigned long flags;
struct rq *rq;
+ int cpu = get_cpu();
+
+#ifdef CONFIG_SMP
+ /*
+ * Fork balancing, do it here and not earlier because:
+ * - cpus_allowed can change in the fork path
+ * - any previously selected cpu might disappear through hotplug
+ *
+ * We still have TASK_WAKING but PF_STARTING is gone now, meaning
+ * ->cpus_allowed is stable, we have preemption disabled, meaning
+ * cpu_online_mask is stable.
+ */
+ cpu = select_task_rq(p, SD_BALANCE_FORK, 0);
+ set_task_cpu(p, cpu);
+#endif

rq = task_rq_lock(p, &flags);
BUG_ON(p->state != TASK_WAKING);
@@ -2665,6 +2675,7 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
p->sched_class->task_woken(rq, p);
#endif
task_rq_unlock(rq, &flags);
+ put_cpu();
}

#ifdef CONFIG_PREEMPT_NOTIFIERS
@@ -7139,14 +7150,18 @@ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
* the ->cpus_allowed mask from under waking tasks, which would be
* possible when we change rq->lock in ttwu(), so synchronize against
* TASK_WAKING to avoid that.
+ *
+ * Make an exception for freshly cloned tasks, since cpuset namespaces
+ * might move the task about, we have to validate the target in
+ * wake_up_new_task() anyway since the cpu might have gone away.
*/
again:
- while (p->state == TASK_WAKING)
+ while (p->state == TASK_WAKING && !(p->flags & PF_STARTING))
cpu_relax();

rq = task_rq_lock(p, &flags);

- if (p->state == TASK_WAKING) {
+ if (p->state == TASK_WAKING && !(p->flags & PF_STARTING)) {
task_rq_unlock(rq, &flags);
goto again;
}

2010-01-24 06:32:30

by Michael Breuer

[permalink] [raw]
Subject: Re: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On 1/24/2010 12:59 AM, Mike Galbraith wrote:
> On Sat, 2010-01-23 at 21:49 -0500, Michael Breuer wrote:
>
>> On 01/13/2010 01:43 PM, Michael Breuer wrote:
>>
>>> [earlier report snipped]
>>
>> I was finally able to bisect this to commit:
>> 3802290628348674985d14914f9bfee7b9084548 (see below)
>>
> I suspect something went wrong during bisection, however...
>
> [stack trace snipped]
>
> ...that looks like a bug which has already been fixed in tip, but not
> yet propagated. Your trace looks like relax forever scenario.
>
> commit fabf318e5e4bda0aca2b0d617b191884fda62703
> [commit message and patch snipped]
>
That commit solves my crash. This was my first time bisecting... I
thought I got it right. With the referenced commit, the system crashed
when libvirtd was started; at the previous commit, it didn't crash.
Regardless, the commit in tip fixes the issue. Hopefully it'll solve
some of the other reported RCU hangs as well. Looks like the change for
freshly cloned tasks is key.

2010-01-24 07:19:49

by Mike Galbraith

[permalink] [raw]
Subject: Re: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On Sun, 2010-01-24 at 01:32 -0500, Michael Breuer wrote:

> That commit solves my crash. Was my first time bisecting... thought I
> got it right. With the referenced commit, the system crashed when
> libvirtd was started, at the previous commit, it didn't crash.
> Regardless, the commit in tip fixes the issue. Hopefully it'll solve
> some of the other reported RCU hangs as well. Looks like the change for
> freshly cloned tasks is key.

Great, and yeah, the freshly cloned bit is the fix for your scenario.

-Mike

2010-01-25 16:03:50

by Peter Zijlstra

[permalink] [raw]
Subject: Re: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On Sat, 2010-01-23 at 21:49 -0500, Michael Breuer wrote:
> Libvirtd always triggers the crash; other things that fork and use mmap
> sometimes do (vsftpd, for example).

I bet it's the nscgroup trainwreck; does this fix it?

---
commit fabf318e5e4bda0aca2b0d617b191884fda62703
Author: Peter Zijlstra <[email protected]>
Date: Thu Jan 21 21:04:57 2010 +0100

sched: Fix fork vs hotplug vs cpuset namespaces

[commit message and patch snipped - identical to the copy posted earlier in the thread]

2010-01-25 16:14:31

by Michael Breuer

[permalink] [raw]
Subject: Re: Bisected rcu hang (kernel/sched.c): was 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd - reproducible.

On 1/25/2010 11:03 AM, Peter Zijlstra wrote:
> On Sat, 2010-01-23 at 21:49 -0500, Michael Breuer wrote:
>
>> Libvirtd always triggers the crash; other things that fork and use mmap
>> sometimes do (vsftpd, for example).
>>
> I bet its the nscgroup trainwreck, does this fix it?
>
> ---
> commit fabf318e5e4bda0aca2b0d617b191884fda62703
> [commit message and patch snipped]
>
Yes, this solved it. I applied it yesterday after Mike Galbraith sent me
the same patch from tip.