Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753095Ab3FJUeY (ORCPT ); Mon, 10 Jun 2013 16:34:24 -0400 Received: from e35.co.us.ibm.com ([32.97.110.153]:51772 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752316Ab3FJUeW convert rfc822-to-8bit (ORCPT ); Mon, 10 Jun 2013 16:34:22 -0400 Date: Mon, 10 Jun 2013 13:33:55 -0700 From: "Paul E. McKenney" To: Dave Jones , Linux Kernel , fweisbec@gmail.com, rostedt@goodmis.org Subject: Re: Another RCU trace. (3.10-rc5) Message-ID: <20130610203355.GO5146@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20130610195145.GA26068@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: 8BIT In-Reply-To: <20130610195145.GA26068@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-MML: No X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13061020-4834-0000-0000-000007EE4C82 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8430 Lines: 74 On Mon, Jun 10, 2013 at 03:51:45PM -0400, Dave Jones wrote: > Something in this code still isn't right in rc5. > I got a pretty mangled trace (and then total lockup) pretty quickly.. > > [ 1300.705212] =============================== > [ 1300.707514] [ INFO: suspicious RCU usage. ] > [ 1300.709794] 3.10.0-rc5+ #6 Not tainted > [ 1300.712064] ------------------------------- > [ 1300.714363] include/linux/rcupdate.h:780 rcu_read_lock() used illegally while idle! > [ 1300.717516] > [ 1300.717516] other info that might help us debug this: > [ 1300.717516] > [ 1300.723450] > [ 1300.723450] RCU used illegally from idle CPU! > [ 1300.723450] rcu_scheduler_active = 1, debug_locks = 0 > [ 1300.730124] RCU used illegally from extended quiescent state! > [ 1300.732971] 2 locks held by kworker/0:0/12904: > [ 1300.735395] #0: (&(&(&p->vtime_seqlock)->lock)->rlock){-.-.-.}, at: [] vtime_account_system.part.1+0x1e/0x50 > [ 1300.740229] #1: (rcu_read_lock){.+.+..}, at: [] cpuacct_account_field+0x5/0x200 > [ 1300.744305] > [ 1300.744305] stack backtrace: > [ 1300.747817] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.756486] ------------[ cut here ]------------ > [ 1300.759463] WARNING: at kernel/sched/cputime.c:652 get_vtime_delta+0x64/0x80() > [ 1300.780868] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.783521] ------------[ cut here ]------------ > [ 1300.784848] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common+0x357/0x380() > [ 1300.792216] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.795211] ------------[ cut here ]------------ > [ 1300.796734] WARNING: at kernel/sched/cputime.c:652 get_vtime_delta+0x64/0x80() > [ 1300.804965] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.808270] ------------[ cut here ]------------ > [ 1300.809921] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common+0x357/0x380() > [ 1300.811591] Modules linked in: tun rfcomm fuse bnep hidp ipt_ULOG nfnetlink scsi_transport_iscsi can_raw af_rxrpc rds nfc can_bcm netrom decnet rose af_key af_802154 can appletalk x25 bluetooth llc2 rfkill pppoe caif_socket pppox caif ppp_generic slhc irda ipx p8023 psnap p8022 llc crc_ccitt atm ax25 phonet nfsv3 nfs_acl nfs lockd sunrpc fscache coretemp hwmon kvm_intel snd_hda_codec_realtek kvm crc32c_intel snd_hda_codec_hdmi ghash_clmulni_intel microcode pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device usb_debug snd_pcm ptp snd_page_alloc pps_core snd_timer snd soundcore xfs libcrc32c > [ 1300.818924] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.822581] ------------[ cut here ]------------ > [ 1300.824419] WARNING: at kernel/sched/cputime.c:652 get_vtime_delta+0x64/0x80() > [ 1300.826271] Modules linked in: tun rfcomm fuse bnep hidp ipt_ULOG nfnetlink scsi_transport_iscsi can_raw af_rxrpc rds nfc can_bcm netrom decnet rose af_key af_802154 can appletalk x25 bluetooth llc2 rfkill pppoe caif_socket pppox caif ppp_generic slhc irda ipx p8023 psnap p8022 llc crc_ccitt atm ax25 phonet nfsv3 nfs_acl nfs lockd sunrpc fscache coretemp hwmon kvm_intel snd_hda_codec_realtek kvm crc32c_intel snd_hda_codec_hdmi ghash_clmulni_intel microcode pcspkr snd_hda_intel snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device usb_debug snd_pcm ptp snd_page_alloc pps_core snd_timer snd soundcore xfs libcrc32c > [ 1300.834252] CPU: 0 PID: 12904 Comm: kworker/0:0 Not tainted 3.10.0-rc5+ #6 > [ 1300.838244] ------------[ cut here ]------------ > [ 1300.840231] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common+0x357/0x380() > [ 1300.895079] WARNING: at kernel/sched/cputime.c:652 get_v[ 1300.8 af_802154 can appletalk x25 bluetooth llc2 rfkill pppoe nfs_acl nfs lockd sunrpc fscache coretemp hwmon kvm_intel snd_h snd_hda_intel snd_hda_codec snd_hwdep snd_seq e1000e sn > [ 1300.907729] CPU: 0 PID: 12904 Comm: kworker/0:0 Not > [ 1300.914988] ffffffff8104a1b1 000000000b89a53f ffff > [ 1300.917385] 0000000000000008 ffffffff81101b57 ffff > [ 1300.919760] Call Trace: > [ 1300.922120] [] dump_stack+0x19/0x1b > [ 1300.924429] [] warn_slowpath_common+0x61/0x80 > [ 1300.926721] [] ? rcu_eqs_exit_common+0x357/0x380 > [ 1300.928993] [] warn_slowpath_null+0x1a/0x20 > [ 1300.931243] [] get_vtime_delta+0x64/0x80 > [ 1300.933487] [] vtime_account_user+0x29/0x[ 1300.935702] [] user_exit+0[ 1300.937882] [] do_page_fau[ 1300.940037] [] page_fault+0x22/0x3[ 1300.942158] [] ? rcu_eqs_e[ 1300.944270] [] ? copy_user[ 1300.946365] [] ? __probe_k[ 1300.948437] [] print_worke[ 1300.950502] [] dump_stack_[ 1300.952543] [] dum[ 1300.954546] [] warn_slowpa[ 1300.956525] [] warn_slowpa[ 1300.958471] [] rcu[ 1300.960401] [] ? get_vtime[ 1300.962309] [] rcu[ 1300.964196] [] rcu_user_ex[ 1300.966071] [] use[ 1300.967911] [] do_[ 1300.969729] [] pag[ 1300.971510] [] ? get_vtime[ 1300.973270] [] ? c[ 1300.975014] [ 1300.989735] [ 1300.991250] [] ? r[ 1300.995698] [ > and then total lockup. > > > I saw some of Steven's patches get merged on Friday, is there anything else > outstanding that didn't make it in yet that I could test ? > Or is this another new bug ? I have three fixes queued up at: git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/urgent Kind of hard to tell whether they are relevant given the interleaved stack traces, but can't hurt to try them out. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/