Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754257Ab2HBN0n (ORCPT ); Thu, 2 Aug 2012 09:26:43 -0400 Received: from mx1.redhat.com ([209.132.183.28]:14142 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753424Ab2HBN0l (ORCPT ); Thu, 2 Aug 2012 09:26:41 -0400 Date: Thu, 2 Aug 2012 09:26:05 -0400 From: Dave Jones To: Sasha Levin Cc: Ingo Molnar , Peter Zijlstra , Andrew Morton , tglx@linutronix.de, "linux-kernel@vger.kernel.org" Subject: Re: tasklist_lock lockdep warnings on 3.6 Message-ID: <20120802132605.GA27474@redhat.com> Mail-Followup-To: Dave Jones , Sasha Levin , Ingo Molnar , Peter Zijlstra , Andrew Morton , tglx@linutronix.de, "linux-kernel@vger.kernel.org" References: <500F954B.4010206@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 18825 Lines: 280 On Mon, Jul 30, 2012 at 04:50:46PM +0200, Sasha Levin wrote: > ping? > > I'm still seeing this on linux-next. Likewise, except I'm seeing it in Linus' tree since shortly after this merge window began. (https://lkml.org/lkml/2012/7/24/443). I've spent all of this last week doing multiple attempts at bisecting this without success. (It doesn't always show up, and even after running the test for over an hour before marking a build 'good') Anyone ? Dave > On Wed, Jul 25, 2012 at 8:42 AM, Sasha Levin wrote: > > Hi all, > > > > I've stumbled on the following while fuzzing with trinity inside a KVM tools guest on 3.6 kernel: > > > > [ 250.495512] ====================================================== > > [ 250.496020] [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] > > [ 250.496020] 3.5.0-sasha-01646-g39c0dda #269 Tainted: G W > > [ 250.496020] ------------------------------------------------------ > > [ 250.496020] trinity-child15/6956 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: > > [ 250.496020] (tasklist_lock){.+.+..}, at: [] posix_cpu_timer_del+0x2a/0x110 > > [ 250.496020] > > [ 250.496020] and this task is already holding: > > [ 250.496020] (&(&new_timer->it_lock)->rlock){-.-...}, at: [] exit_itimers+0x50/0x140 > > [ 250.496020] which would create a new lock dependency: > > [ 250.496020] (&(&new_timer->it_lock)->rlock){-.-...} -> (tasklist_lock){.+.+..} > > [ 250.496020] > > [ 250.496020] but this new dependency connects a HARDIRQ-irq-safe lock: > > [ 250.496020] (&(&new_timer->it_lock)->rlock){-.-...} > > ... which became HARDIRQ-irq-safe at: > > [ 250.496020] [] mark_irqflags+0x7c/0x190 > > [ 250.496020] [] __lock_acquire+0x936/0xb60 > > [ 250.496020] [] lock_acquire+0x1ca/0x270 > > [ 250.496020] [] _raw_spin_lock_irqsave+0x79/0xc0 > > [ 250.496020] [] posix_timer_fn+0x32/0xd0 > > [ 250.496020] [] __run_hrtimer+0x27e/0x4d0 > > [ 250.496020] [] hrtimer_interrupt+0x119/0x220 > > [ 250.496020] [] smp_apic_timer_interrupt+0x85/0xa0 > > [ 250.496020] [] apic_timer_interrupt+0x6f/0x80 > > [ 250.496020] [] rcu_lockdep_current_cpu_online+0x6e/0xa0 > > [ 250.496020] [] proc_sys_compare+0x7e/0x130 > > [ 250.496020] [] __d_lookup+0x18c/0x2e0 > > [ 250.496020] [] lookup_fast+0x136/0x240 > > [ 250.496020] [] link_path_walk+0x22c/0x8f0 > > [ 250.496020] [] path_lookupat+0x57/0x720 > > [ 250.496020] [] do_path_lookup+0x2c/0xc0 > > [ 250.496020] [] user_path_at_empty+0x64/0xa0 > > [ 250.496020] [] user_path_at+0xc/0x10 > > [ 250.496020] [] sys_chdir+0x1e/0x70 > > [ 250.496020] [] system_call_fastpath+0x1a/0x1f > > [ 250.496020] > > [ 250.496020] to a HARDIRQ-irq-unsafe lock: > > [ 250.496020] (&(&p->alloc_lock)->rlock){+.+...} > > ... which became HARDIRQ-irq-unsafe at: > > [ 250.496020] ... [] mark_irqflags+0x100/0x190 > > [ 250.496020] [] __lock_acquire+0x936/0xb60 > > [ 250.496020] [] lock_acquire+0x1ca/0x270 > > [ 250.496020] [] _raw_spin_lock+0x3b/0x70 > > [ 250.496020] [] set_task_comm+0x31/0x1c0 > > [ 250.496020] [] kthreadd+0x2c/0x170 > > [ 250.496020] [] kernel_thread_helper+0x4/0x10 > > [ 250.496020] > > [ 250.496020] other info that might help us debug this: > > [ 250.496020] > > [ 250.496020] Chain exists of: > > &(&new_timer->it_lock)->rlock --> tasklist_lock --> &(&p->alloc_lock)->rlock > > > > [ 250.496020] Possible interrupt unsafe locking scenario: > > [ 250.496020] > > [ 250.496020] CPU0 CPU1 > > [ 250.496020] ---- ---- > > [ 250.496020] lock(&(&p->alloc_lock)->rlock); > > [ 250.496020] local_irq_disable(); > > [ 250.496020] lock(&(&new_timer->it_lock)->rlock); > > [ 250.496020] lock(tasklist_lock); > > [ 250.496020] > > [ 250.496020] lock(&(&new_timer->it_lock)->rlock); > > [ 250.496020] > > [ 250.496020] *** DEADLOCK *** > > [ 250.496020] > > [ 250.496020] 1 lock held by trinity-child15/6956: > > [ 250.496020] #0: (&(&new_timer->it_lock)->rlock){-.-...}, at: [] exit_itimers+0x50/0x140 > > [ 250.496020] > > the dependencies between HARDIRQ-irq-safe lock and the holding lock: > > [ 250.496020] -> (&(&new_timer->it_lock)->rlock){-.-...} ops: 6221 { > > [ 250.496020] IN-HARDIRQ-W at: > > [ 250.496020] [] mark_irqflags+0x7c/0x190 > > [ 250.496020] [] __lock_acquire+0x936/0xb60 > > [ 250.496020] [] lock_acquire+0x1ca/0x270 > > [ 250.496020] [] _raw_spin_lock_irqsave+0x79/0xc0 > > [ 250.496020] [] posix_timer_fn+0x32/0xd0 > > [ 250.496020] [] __run_hrtimer+0x27e/0x4d0 > > [ 250.496020] [] hrtimer_interrupt+0x119/0x220 > > [ 250.496020] [] smp_apic_timer_interrupt+0x85/0xa0 > > [ 250.496020] [] apic_timer_interrupt+0x6f/0x80 > > [ 250.496020] [] rcu_lockdep_current_cpu_online+0x6e/0xa0 > > [ 250.496020] [] proc_sys_compare+0x7e/0x130 > > [ 250.496020] [] __d_lookup+0x18c/0x2e0 > > [ 250.496020] [] lookup_fast+0x136/0x240 > > [ 250.496020] [] link_path_walk+0x22c/0x8f0 > > [ 250.496020] [] path_lookupat+0x57/0x720 > > [ 250.496020] [] do_path_lookup+0x2c/0xc0 > > [ 250.496020] [] user_path_at_empty+0x64/0xa0 > > [ 250.496020] [] user_path_at+0xc/0x10 > > [ 250.496020] [] sys_chdir+0x1e/0x70 > > [ 250.496020] [] system_call_fastpath+0x1a/0x1f > > [ 250.496020] IN-SOFTIRQ-W at: > > [ 250.496020] [] mark_irqflags+0xa7/0x190 > > [ 250.496020] [] __lock_acquire+0x936/0xb60 > > [ 250.496020] [] lock_acquire+0x1ca/0x270 > > [ 250.496020] [] _raw_spin_lock_irqsave+0x79/0xc0 > > [ 250.496020] [] posix_timer_fn+0x32/0xd0 > > [ 250.496020] [] __run_hrtimer+0x27e/0x4d0 > > [ 250.496020] [] hrtimer_interrupt+0x119/0x220 > > [ 250.561155] [] smp_apic_timer_interrupt+0x85/0xa0 > > [ 250.561155] [] apic_timer_interrupt+0x6f/0x80 > > [ 250.561155] [] rcu_start_gp+0x467/0x500 > > [ 250.561155] [] rcu_report_qs_rsp+0x223/0x290 > > [ 250.561155] [] rcu_report_qs_rnp+0x251/0x2b0 > > [ 250.561155] [] rcu_check_quiescent_state+0x9b/0xc0 > > [ 250.561155] [] __rcu_process_callbacks+0x86/0xe0 > > [ 250.561155] [] rcu_process_callbacks+0xd8/0x210 > > [ 250.561155] [] __do_softirq+0x1c5/0x450 > > [ 250.561155] [] run_ksoftirqd+0x105/0x250 > > [ 250.561155] [] kthread+0xad/0xc0 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] INITIAL USE at: > > [ 250.561155] [] __lock_acquire+0x95c/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_spin_lock_irqsave+0x79/0xc0 > > [ 250.561155] [] __lock_timer+0xa7/0x1a0 > > [ 250.561155] [] sys_timer_getoverrun+0x12/0x50 > > [ 250.561155] [] system_call_fastpath+0x1a/0x1f > > [ 250.561155] } > > [ 250.561155] ... key at: [] __key.29517+0x0/0x8 > > [ 250.561155] ... acquired at: > > [ 250.561155] [] check_usage+0x1bd/0x1e0 > > [ 250.561155] [] check_irq_usage+0x6a/0xe0 > > [ 250.561155] [] check_prev_add+0x14b/0x640 > > [ 250.561155] [] check_prevs_add+0xba/0x1a0 > > [ 250.561155] [] validate_chain.isra.22+0x6a0/0x7b0 > > [ 250.561155] [] __lock_acquire+0xaa1/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_read_lock+0x3e/0x80 > > [ 250.561155] [] posix_cpu_timer_del+0x2a/0x110 > > [ 250.561155] [] exit_itimers+0xcd/0x140 > > [ 250.561155] [] do_exit+0x1b8/0x580 > > [ 250.561155] [] do_group_exit+0x8a/0xc0 > > [ 250.561155] [] sys_exit_group+0x12/0x20 > > [ 250.561155] [] system_call_fastpath+0x1a/0x1f > > [ 250.561155] > > [ 250.561155] > > [ 250.561155] the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: > > [ 250.561155] -> (&(&p->alloc_lock)->rlock){+.+...} ops: 921205 { > > [ 250.561155] HARDIRQ-ON-W at: > > [ 250.561155] [] mark_irqflags+0x100/0x190 > > [ 250.561155] [] __lock_acquire+0x936/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_spin_lock+0x3b/0x70 > > [ 250.561155] [] set_task_comm+0x31/0x1c0 > > [ 250.561155] [] kthreadd+0x2c/0x170 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] SOFTIRQ-ON-W at: > > [ 250.561155] [] mark_irqflags+0x123/0x190 > > [ 250.561155] [] __lock_acquire+0x936/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_spin_lock+0x3b/0x70 > > [ 250.561155] [] set_task_comm+0x31/0x1c0 > > [ 250.561155] [] kthreadd+0x2c/0x170 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] INITIAL USE at: > > [ 250.561155] [] __lock_acquire+0x95c/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_spin_lock+0x3b/0x70 > > [ 250.561155] [] set_task_comm+0x31/0x1c0 > > [ 250.561155] [] kthreadd+0x2c/0x170 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] } > > [ 250.561155] ... key at: [] __key.45219+0x0/0x8 > > [ 250.561155] ... acquired at: > > [ 250.561155] [] check_prevs_add+0xba/0x1a0 > > [ 250.561155] [] validate_chain.isra.22+0x6a0/0x7b0 > > [ 250.561155] [] __lock_acquire+0xaa1/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_spin_lock+0x3b/0x70 > > [ 250.561155] [] keyctl_session_to_parent+0x105/0x3f0 > > [ 250.561155] [] sys_keyctl+0x155/0x1a0 > > [ 250.561155] [] system_call_fastpath+0x1a/0x1f > > [ 250.561155] > > [ 250.561155] -> (tasklist_lock){.+.+..} ops: 81487 { > > [ 250.561155] HARDIRQ-ON-R at: > > [ 250.561155] [] mark_irqflags+0xcc/0x190 > > [ 250.561155] [] __lock_acquire+0x936/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_read_lock+0x3e/0x80 > > [ 250.561155] [] do_wait+0x178/0x3b0 > > [ 250.561155] [] sys_wait4+0xbb/0xe0 > > [ 250.561155] [] wait_for_helper+0x80/0xa0 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] SOFTIRQ-ON-R at: > > [ 250.561155] [] mark_irqflags+0x123/0x190 > > [ 250.561155] [] __lock_acquire+0x936/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_read_lock+0x3e/0x80 > > [ 250.561155] [] do_wait+0x178/0x3b0 > > [ 250.561155] [] sys_wait4+0xbb/0xe0 > > [ 250.561155] [] wait_for_helper+0x80/0xa0 > > [ 250.561155] [] kernel_thread_helper+0x4/0x10 > > [ 250.561155] INITIAL USE at: > > [ 250.561155] [] __lock_acquire+0x95c/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_write_lock_irq+0x59/0x90 > > [ 250.561155] [] copy_process+0xb0d/0x11a0 > > [ 250.561155] [] do_fork+0x104/0x3d0 > > [ 250.561155] [] kernel_thread+0x71/0x80 > > [ 250.561155] [] rest_init+0x21/0x144 > > [ 250.561155] [] start_kernel+0x378/0x385 > > [ 250.561155] [] x86_64_start_reservations+0x101/0x105 > > [ 250.561155] [] x86_64_start_kernel+0xe5/0xf4 > > [ 250.561155] } > > [ 250.561155] ... key at: [] tasklist_lock+0x18/0x80 > > [ 250.561155] ... acquired at: > > [ 250.561155] [] check_usage+0x1bd/0x1e0 > > [ 250.561155] [] check_irq_usage+0x6a/0xe0 > > [ 250.561155] [] check_prev_add+0x14b/0x640 > > [ 250.561155] [] check_prevs_add+0xba/0x1a0 > > [ 250.561155] [] validate_chain.isra.22+0x6a0/0x7b0 > > [ 250.561155] [] __lock_acquire+0xaa1/0xb60 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] _raw_read_lock+0x3e/0x80 > > [ 250.561155] [] posix_cpu_timer_del+0x2a/0x110 > > [ 250.561155] [] exit_itimers+0xcd/0x140 > > [ 250.561155] [] do_exit+0x1b8/0x580 > > [ 250.561155] [] do_group_exit+0x8a/0xc0 > > [ 250.561155] [] sys_exit_group+0x12/0x20 > > [ 250.561155] [] system_call_fastpath+0x1a/0x1f > > [ 250.561155] > > [ 250.561155] > > [ 250.561155] stack backtrace: > > [ 250.561155] Pid: 6956, comm: trinity-child15 Tainted: G W 3.5.0-sasha-01646-g39c0dda #269 > > [ 250.561155] Call Trace: > > [ 250.561155] [] print_bad_irq_dependency+0x2ea/0x310 > > [ 250.561155] [] check_usage+0x1bd/0x1e0 > > [ 250.561155] [] ? pvclock_clocksource_read+0x55/0xe0 > > [ 250.561155] [] check_irq_usage+0x6a/0xe0 > > [ 250.561155] [] check_prev_add+0x14b/0x640 > > [ 250.561155] [] ? pvclock_clocksource_read+0x55/0xe0 > > [ 250.561155] [] check_prevs_add+0xba/0x1a0 > > [ 250.561155] [] ? sched_clock+0x15/0x20 > > [ 250.561155] [] validate_chain.isra.22+0x6a0/0x7b0 > > [ 250.561155] [] __lock_acquire+0xaa1/0xb60 > > [ 250.561155] [] ? __lock_acquire+0xac5/0xb60 > > [ 250.561155] [] ? _raw_spin_unlock_irqrestore+0x7c/0xa0 > > [ 250.561155] [] lock_acquire+0x1ca/0x270 > > [ 250.561155] [] ? posix_cpu_timer_del+0x2a/0x110 > > [ 250.561155] [] ? __lock_acquired+0x2ce/0x360 > > [ 250.561155] [] ? exit_itimers+0x50/0x140 > > [ 250.561155] [] _raw_read_lock+0x3e/0x80 > > [ 250.561155] [] ? posix_cpu_timer_del+0x2a/0x110 > > [ 250.561155] [] posix_cpu_timer_del+0x2a/0x110 > > [ 250.561155] [] exit_itimers+0xcd/0x140 > > [ 250.561155] [] do_exit+0x1b8/0x580 > > [ 250.561155] [] do_group_exit+0x8a/0xc0 > > [ 250.561155] [] sys_exit_group+0x12/0x20 > > [ 250.561155] [] system_call_fastpath+0x1a/0x1f ---end quoted text--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/