Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758171AbXLOGAD (ORCPT ); Sat, 15 Dec 2007 01:00:03 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751799AbXLOF7u (ORCPT ); Sat, 15 Dec 2007 00:59:50 -0500 Received: from smtp2.linux-foundation.org ([207.189.120.14]:54848 "EHLO smtp2.linux-foundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751636AbXLOF7t (ORCPT ); Sat, 15 Dec 2007 00:59:49 -0500 Date: Fri, 14 Dec 2007 21:58:51 -0800 From: Andrew Morton To: "Miles Lane" Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, "Ingo Molnar" , "Peter Zijlstra" Subject: Re: 2.6.24-rc5-mm1 -- inconsistent {in-softirq-W} -> {softirq-on-R} usage. Message-Id: <20071214215851.4719b631.akpm@linux-foundation.org> In-Reply-To: References: <20071214153633.5d8f609a.akpm@linux-foundation.org> X-Mailer: Sylpheed 2.4.1 (GTK+ 2.8.17; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7910 Lines: 178 On Fri, 14 Dec 2007 22:58:24 -0500 "Miles Lane" wrote: > On Dec 14, 2007 6:36 PM, Andrew Morton wrote: > > On Fri, 14 Dec 2007 17:13:21 -0500 > > "Miles Lane" wrote: > > > > > > > Sorry Andrew, I don't know who to forward this problem to. > > > > > > I tried running: find /proc | xargs cat > > > and got this: > > > > > > ================================= > > > [ INFO: inconsistent lock state ] > > > 2.6.24-rc5-mm1 #26 > > > --------------------------------- > > > inconsistent {in-softirq-W} -> {softirq-on-R} usage. > > > cat/6944 [HC0[0]:SC0[0]:HE1:SE1] takes: > > > BUG: unable to handle kernel paging request at virtual address 0f1eff0b > > > printing ip: c01fe64d *pde = 00000000 > > > Oops: 0000 [#1] PREEMPT SMP > > > last sysfs file: /sys/block/sda/sda3/stat > > > Modules linked in: aes_generic i915 drm rfcomm l2cap bluetooth > > > cpufreq_stats cpufreq_conservative cpufreq_performance sbs sbshc > > > dm_crypt sbp2 parport_pc lp parport pcmcia arc4 ecb crypto_blkcipher > > > cryptomgr crypto_algapi tifm_7xx1 tifm_core yenta_socket > > > rsrc_nonstatic pcmcia_core iwl3945 iTCO_wdt iTCO_vendor_support > > > watchdog_core watchdog_dev snd_hda_intel mac80211 snd_pcm_oss > > > snd_mixer_oss cfg80211 snd_pcm sky2 snd_seq_dummy snd_seq_oss > > > snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer > > > snd_seq_device snd soundcore snd_page_alloc shpchp pci_hotplug > > > firewire_ohci firewire_core crc_itu_t ata_generic piix ide_core > > > > > > Pid: 6944, comm: cat Not tainted (2.6.24-rc5-mm1 #26) > > > EIP: 0060:[] EFLAGS: 00210097 CPU: 0 > > > EIP is at strnlen+0x9/0x1c > > > EAX: 0f1eff0b EBX: 0f1eff0b ECX: 0f1eff0b EDX: fffffffe > > > ESI: c05b74f6 EDI: d6267d94 EBP: d6267cc8 ESP: d6267cc8 > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > > > Process cat (pid: 6944, ti=d6267000 task=d5a09000 task.ti=d6267000) > > > Stack: d6267cfc c01fdd22 00000400 c05b74f4 00000001 c05b78f4 00000000 ffffffff > > > ffffffff c048f503 00000400 d5a09000 00000002 d6267d0c c01fdf41 d6267d94 > > > db68c04a d6267d74 c012ae81 d6267d94 00000028 c05b89f7 00200046 00000000 > > > Call Trace: > > > [] show_trace_log_lvl+0x12/0x25 > > > [] show_stack_log_lvl+0x8a/0x95 > > > [] show_registers+0x8a/0x1bd > > > [] die+0x118/0x1dc > > > [] do_page_fault+0x5a4/0x681 > > > [] error_code+0x72/0x78 > > > [] vsnprintf+0x277/0x40e > > > [] vscnprintf+0xe/0x1d > > > [] vprintk+0xcb/0x2f3 > > > [] printk+0x15/0x17 > > > [] print_lock_name+0x4e/0xa2 > > > [] print_lock+0xe/0x3a > > > [] print_usage_bug+0xbc/0x117 > > > [] mark_lock+0x2e7/0x3fe > > > [] __lock_acquire+0x498/0xbf4 > > > [] lock_acquire+0x76/0x9d > > > [] _read_lock+0x23/0x32 > > > [] sock_i_ino+0x14/0x30 > > > [] packet_seq_show+0x22/0x75 > > > [] seq_read+0x19d/0x26f > > > [] proc_reg_read+0x60/0x74 > > > [] vfs_read+0x8a/0x106 > > > [] sys_read+0x3b/0x60 > > > [] sysenter_past_esp+0x6b/0xc1 > > > ======================= > > > Code: 01 00 00 00 4f 89 fa 5f 89 d0 5d c3 55 85 c9 89 e5 57 89 c7 89 > > > d0 74 05 f2 ae 75 01 4f 89 f8 5f 5d c3 55 89 c1 89 e5 89 c8 eb 06 <80> > > > 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 5d c3 90 90 90 55 83 > > > EIP: [] strnlen+0x9/0x1c SS:ESP 0068:d6267cc8 > > > note: cat[6944] exited with preempt_count 4 > > > > I'd say you hit a networking locking bug and then when trying to report > > that bug, lockdep crashed. > > > > The networking bug looks to be around sock_i_ino()'s taking of > > sk_callback_lock with softirq's enabled. Perhaps this will fix it. > > > > diff -puN net/core/sock.c~a net/core/sock.c > > --- a/net/core/sock.c~a > > +++ a/net/core/sock.c > > @@ -1115,9 +1115,9 @@ int sock_i_uid(struct sock *sk) > > { > > int uid; > > > > - read_lock(&sk->sk_callback_lock); > > + read_lock_bh(&sk->sk_callback_lock); > > uid = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_uid : 0; > > - read_unlock(&sk->sk_callback_lock); > > + read_unlock_bh(&sk->sk_callback_lock); > > return uid; > > } > > > > @@ -1125,9 +1125,9 @@ unsigned long sock_i_ino(struct sock *sk > > { > > unsigned long ino; > > > > - read_lock(&sk->sk_callback_lock); > > + read_lock_bh(&sk->sk_callback_lock); > > ino = sk->sk_socket ? SOCK_INODE(sk->sk_socket)->i_ino : 0; > > - read_unlock(&sk->sk_callback_lock); > > + read_unlock_bh(&sk->sk_callback_lock); > > return ino; > > } > > > > _ > > > > > > I applied the patch and then tried my test again. This time my system > locked up. > Perhaps I should open a new thread for this, since the problem looks > pretty different. > > Dec 14 21:32:55 feargod kernel: process `cat' is using deprecated > sysctl (syscall) net.ipv6.neigh.default.retrans_time; Use > net.ipv6.neigh.default.retran > s_time_ms instead. > Dec 14 21:32:55 feargod kernel: > Dec 14 21:32:55 feargod kernel: ===================================== > Dec 14 21:32:55 feargod kernel: [ BUG: bad unlock balance detected! ] > Dec 14 21:32:55 feargod kernel: ------------------------------------- > Dec 14 21:32:55 feargod kernel: cat/6180 is trying to release lock > (kkkkkkk�����H3��) at: > Dec 14 21:32:55 feargod kernel: [packet_seq_stop+0xe/0x10] > packet_seq_stop+0xe/0x10 > Dec 14 21:32:55 feargod kernel: but there are no more locks to release! > Dec 14 21:32:55 feargod kernel: > Dec 14 21:32:55 feargod kernel: other info that might help us debug this: > Dec 14 21:32:55 feargod kernel: 2 locks held by cat/6180: > Dec 14 21:32:55 feargod kernel: #0: (&p->lock){--..}, at: > [crypto_algapi:seq_read+0x25/0x191c1] seq_read+0x25/0x26f > Dec 14 21:32:55 feargod kernel: #1: > (&net->packet.sklist_lock){-.--}, at: [packet_seq_start+0x14/0x4d] > packet_seq_start+0x14/0x4d > Dec 14 21:32:55 feargod kernel: > Dec 14 21:32:55 feargod kernel: stack backtrace: > Dec 14 21:32:55 feargod kernel: Pid: 6180, comm: cat Not tainted > 2.6.24-rc5-mm1 #27 > Dec 14 21:32:55 feargod kernel: [show_trace_log_lvl+0x12/0x25] > show_trace_log_lvl+0x12/0x25 > Dec 14 21:32:55 feargod kernel: [show_trace+0xd/0x10] show_trace+0xd/0x10 > Dec 14 21:32:55 feargod kernel: [sbp2:dump_stack+0x57/0x17a1] > dump_stack+0x57/0x5f > Dec 14 21:32:55 feargod kernel: > [print_unlock_inbalance_bug+0xce/0xd8] > print_unlock_inbalance_bug+0xce/0xd8 > Dec 14 21:32:55 feargod kernel: [lock_release_non_nested+0x89/0x13b] > lock_release_non_nested+0x89/0x13b > Dec 14 21:32:55 feargod kernel: [mac80211:lock_release+0x186/0x5065] > lock_release+0x186/0x1b6 > Dec 14 21:32:55 feargod kernel: [snd_seq:_read_unlock+0x16/0x3d] > _read_unlock+0x16/0x3d > Dec 14 21:32:55 feargod kernel: [packet_seq_stop+0xe/0x10] > packet_seq_stop+0xe/0x10 > Dec 14 21:32:55 feargod kernel: > [crypto_algapi:seq_read+0x1d4/0x191c1] seq_read+0x1d4/0x26f > Dec 14 21:32:55 feargod kernel: [proc_reg_read+0x60/0x74] > proc_reg_read+0x60/0x74 > Dec 14 21:32:55 feargod kernel: [vfs_read+0x8a/0x106] vfs_read+0x8a/0x106 > Dec 14 21:32:55 feargod kernel: [sys_read+0x3b/0x60] sys_read+0x3b/0x60 > Dec 14 21:32:55 feargod kernel: [sysenter_past_esp+0x6b/0xc1] ugly. At a guess I'd say the code in there has got a hold of some corrupted/garbage memory when it thinks it has a struct sock*. Do you have a recipe which others might use to reproduce this? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/