From: "Falauto, Gerlando"
To: "Falauto, Gerlando", John Stultz, "linux-kernel@vger.kernel.org", Thomas Gleixner, Richard Cochran, Prarit Bhargava
CC: "Brunck, Holger", "Longchamp, Valentin", "Bigler, Stefan"
Date: Thu, 29 Aug 2013 22:56:50 +0200
Subject: kernel deadlock
In-Reply-To: <521F6D06.1040107@keymile.com>

Hi everyone,

I ran into the deadlock reported at the bottom of this message.
On my latest 3.10 kernel I do not actually get the report (the kernel
just hangs), so it took me quite some time to track it down.

Once I figured out that the trigger for the hang was adjtimex(), I
reverted everything between 3.9 and 3.10 that touched
kernel/time/timekeeping.c and kernel/time/ntp.c, double-checked that
the problem no longer occurred, and then started bisecting, which
landed on the offending commit below.

THEN, and ONLY THEN, did I get the &%""?+"% deadlock report.

Do you guys have any ideas what could be wrong and how to fix it?

Thank you,
Gerlando

commit 06c017fdd4dc48451a29ac37fc1db4a3f86b7f40
Author: John Stultz
Date:   Fri Mar 22 11:37:28 2013 -0700

    timekeeping: Hold timekeepering locks in do_adjtimex and hardpps

    In moving the NTP state to be protected by the timekeeping locks,
    be sure to acquire the timekeeping locks prior to calling ntp
    functions.

    Cc: Thomas Gleixner
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Signed-off-by: John Stultz

=================================
[ INFO: inconsistent lock state ]
3.10.0-04864-g346ecc9-dirty #16 Not tainted
---------------------------------
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
SAKEY/738 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (timekeeper_lock){?.-...}, at: [] do_adjtimex+0x64/0xbc
{IN-HARDIRQ-W} state was registered at:
  [] __lock_acquire+0xabc/0x1bb8
  [] lock_acquire+0xa8/0x15c
  [] _raw_spin_lock_irqsave+0x50/0x64
  [] do_timer+0x2c/0xa54
  [] tick_periodic+0x74/0x9c
  [] tick_handle_periodic+0x18/0x7c
  [] orion_timer_interrupt+0x24/0x34
  [] handle_irq_event_percpu+0x5c/0x300
  [] handle_irq_event+0x3c/0x5c
  [] handle_level_irq+0x8c/0xe8
  [] generic_handle_irq+0x30/0x4c
  [] handle_IRQ+0x30/0x84
  [] __irq_svc+0x38/0xa0
  [] calibrate_delay+0x350/0x4e4
  [] start_kernel+0x23c/0x2c4
  [<0000803c>] 0x803c
irq event stamp: 32358
hardirqs last  enabled at (32357): [] ret_fast_syscall+0x24/0x44
hardirqs last disabled at (32358): [] _raw_spin_lock_irqsave+0x20/0x64
softirqs last  enabled at (32160): [] __do_softirq+0x1b8/0x308
softirqs last disabled at (32137): [] irq_exit+0xa0/0xd8

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(timekeeper_lock);
  <Interrupt>
    lock(timekeeper_lock);

 *** DEADLOCK ***

1 lock held by SAKEY/738:
 #0:  (timekeeper_lock){?.-...}, at: [] do_adjtimex+0x64/0xbc

stack backtrace:
CPU: 0 PID: 738 Comm: SAKEY Not tainted 3.10.0-04864-g346ecc9-dirty #16
[] (unwind_backtrace+0x0/0xf0) from [] (show_stack+0x10/0x14)
[] (show_stack+0x10/0x14) from [] (print_usage_bug.part.27+0x218/0x280)
[] (print_usage_bug.part.27+0x218/0x280) from [] (mark_lock+0x538/0x6bc)
[] (mark_lock+0x538/0x6bc) from [] (mark_held_locks+0x90/0x124)
[] (mark_held_locks+0x90/0x124) from [] (trace_hardirqs_on_caller+0xa8/0x23c)
[] (trace_hardirqs_on_caller+0xa8/0x23c) from [] (_raw_spin_unlock_irq+0x24/0x5c)
[] (_raw_spin_unlock_irq+0x24/0x5c) from [] (__do_adjtimex+0x17c/0x65c)
[] (__do_adjtimex+0x17c/0x65c) from [] (do_adjtimex+0x84/0xbc)
[] (do_adjtimex+0x84/0xbc) from [] (SyS_adjtimex+0x50/0xa8)
[] (SyS_adjtimex+0x50/0xa8) from [] (ret_fast_syscall+0x0/0x44)