Date: Thu, 7 Mar 2013 23:22:40 +0800
From: Lei Wen
To: Tejun Heo
Cc: linux-kernel@vger.kernel.org, leiwen@marvell.com, wwang27@marvell.com
Subject: Re: workqueue panic in 3.4 kernel

Tejun,

On Thu, Mar 7, 2013 at 9:15 AM, Lei Wen wrote:
> Hi Tejun,
>
> On Thu, Mar 7, 2013 at 3:14 AM, Tejun Heo wrote:
>> Hello, Lei.
>>
>> On Wed, Mar 06, 2013 at 10:39:15PM +0800, Lei Wen wrote:
>>> We found a race condition as below:
>>>
>>> CPU0                                          CPU1
>>> timer interrupt happens
>>> __run_timers
>>>   __run_timers::spin_lock_irq(&base->lock)
>>>   __run_timers::spin_unlock_irq(&base->lock)
>>>                                               __cancel_work_timer
>>>                                               __cancel_work_timer::del_timer
>>>                                               __cancel_work_timer::wait_on_work
>>>                                               __cancel_work_timer::clear_work_data
>>>   __run_timers::call_timer_fn(timer, fn, data);
>>>     delayed_work_timer_fn::get_work_cwq
>>>   __run_timers::spin_lock_irq(&base->lock)
>>>
>>> It is possible for __cancel_work_timer to run on CPU1 __BEFORE__ CPU0 is
>>> ready to run the timer callback, which is delayed_work_timer_fn in our case.
>>
>> If del_timer() happens after the timer starts running, del_timer()
>> would return 0 and try_to_grab_pending() will be called, which returns
>> >= 0 iff it successfully steals the PENDING bit (i.e. it becomes the
>> sole owner of the work item).  If del_timer() happens before the timer
>> starts running, the timer function would never run.
>
> If del_timer() happens before __run_timers() is called, but after the timer
> irq has already fired, would it return 1, since the timer has not yet been
> detached by __run_timers()?  If that is possible, then try_to_grab_pending()
> would not be called, yet work->data would still be cleared on that path.
>
>> clear_work_data() happens iff the work item is confirmed to be idle.
>> At this point, I'm pretty skeptical this is a bug in workqueue itself
>> and strongly suggest looking at the crashing workqueue user.
>
> Also, I am not very familiar with the workqueue mechanism: how many places
> in the kernel clear work->data besides clear_work_data()?
>
> From the memory dump, I cannot find any hint that the work structure was
> destroyed, so the only remaining possibility I can see is that work->data
> was set by someone on purpose.
>
> crash> struct delayed_work 0xbf03d544 -x
> struct delayed_work {
>   work = {
>     data = {
>       counter = 0x300
>     },
>     entry = {
>       next = 0xbf03d548,
>       prev = 0xbf03d548
>     },
>     func = 0xbf014b00
>   },
>   timer = {
>     entry = {
>       next = 0x0,
>       prev = 0x200200
>     },
>     expires = 0x12b638b,
>     base = 0xc0844e01,
>     function = 0xc014c7a0 <delayed_work_timer_fn>,
>     data = 0xbf03d544,
>     slack = 0xffffffff,
>     start_pid = 0xffffffff,
>     start_site = 0x0,
>     start_comm = "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
>   }
> }

I captured a trace log which shows that my previous suspicion is true:
__cancel_work_timer() is called before delayed_work_timer_fn(), so work->data
is already cleared by the time the timer callback runs.
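
For reference, this is roughly how I read get_work_cwq() and
delayed_work_timer_fn() in the 3.4-era kernel/workqueue.c (a simplified
paraphrase on my side, not the exact source), which is why a cleared
work->data ends up as a bad dereference inside the timer callback:

/* simplified sketch of the 3.4-era helpers, details trimmed */
static struct cpu_workqueue_struct *get_work_cwq(struct work_struct *work)
{
        unsigned long data = atomic_long_read(&work->data);

        /* once clear_work_data() has run, the CWQ bit is gone */
        if (data & WORK_STRUCT_CWQ)
                return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
        else
                return NULL;
}

static void delayed_work_timer_fn(unsigned long __data)
{
        struct delayed_work *dwork = (struct delayed_work *)__data;
        struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);

        /* if cwq came back NULL, this dereference is the panic we see */
        __queue_work(smp_processor_id(), cwq->wq, &dwork->work);
}

If I read the flag bits right, the dumped work->data value of 0x300 has no
WORK_STRUCT_CWQ bit set, so get_work_cwq() would return NULL here and the
cwq->wq dereference would be exactly the crash we observe.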
And when __cancel_work_timer() is called, the timer is still pending, so
del_timer() returns 1 and try_to_grab_pending() is therefore never called.
But it is very strange that __run_timers() still picks up the same timer
afterwards; its callback, delayed_work_timer_fn(), then runs and causes the
issue.  The detach_timer() done by __cancel_work_timer()'s del_timer() should
already have removed the timer from all lists, am I right?  Could the
timer_list somehow be queued twice, e.g. queued on two CPUs?  If not, how
could this happen?

Thanks,
Lei
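
P.S. For completeness, this is roughly the __cancel_work_timer() path in 3.4
as I understand it (again a simplified paraphrase, not the exact source).
The point is that when del_timer() returns 1 we never reach
try_to_grab_pending(), yet clear_work_data() still runs:

static bool __cancel_work_timer(struct work_struct *work,
                                struct timer_list *timer)
{
        int ret;

        do {
                /* returns 1 if the timer was still pending on a list */
                ret = (timer && likely(del_timer(timer)));
                /* only a failed del_timer() falls back to stealing PENDING */
                if (!ret)
                        ret = try_to_grab_pending(work);
                wait_on_work(work);
        } while (unlikely(ret < 0));

        /* runs unconditionally, even on the del_timer() == 1 path */
        clear_work_data(work);

        return ret;
}

So if the timer can somehow still fire after del_timer() has returned 1,
delayed_work_timer_fn() ends up running against a work item whose data has
already been cleared, which matches the trace above.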