Date: Thu, 10 Dec 2015 10:29:01 -0500
From: Tejun Heo
To: Nikolay Borisov
Cc: "Linux-Kernel@Vger. Kernel. Org", SiteGround Operations
Subject: Re: corruption causing crash in __queue_work
Message-ID: <20151210152901.GR30240@mtj.duckdns.org>
References: <566819D8.5090804@kyup.com> <20151209160803.GK30240@mtj.duckdns.org> <56685573.1020805@kyup.com> <20151209162744.GN30240@mtj.duckdns.org> <566945A2.1050208@kyup.com>
In-Reply-To: <566945A2.1050208@kyup.com>

On Thu, Dec 10, 2015 at 11:28:02AM +0200, Nikolay Borisov wrote:
> On 12/09/2015 06:27 PM, Tejun Heo wrote:
> > Hello,
> >
> > On Wed, Dec 09, 2015 at 06:23:15PM +0200, Nikolay Borisov wrote:
> >> I think we are seeing this at least daily on at least 1 server (we have
> >> multiple servers like that). So adding printk's would likely be the way
> >> to go, anything in particular you might be interested in knowing? I see
> >> RCU stuff around so might be tricky race condition.
> >
> > Printing out the workqueue's pointer, name, pwq's pointer, the node
> > being installed for and the installed pointer should give us enough
> > clues. There's RCU involved but the pointers shouldn't be becoming
> > NULLs unless we're installing NULL ptrs.
>
> So the debug patch has been rolled on 1 server and several more
> are in the process, here it is what it prints:
>
> WQ: ffff88046f00ba00 (events_unbound) old_pwq: (null) new_pwq: ffff88046f00d300 node: 0
...
> Is this format ok? Also I observed the exact same crash
> on a machine running 4.1.12 kernel as well.

Yeah, I think it can be a good starting point.

Thanks.

-- 
tejun
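
For anyone collecting these debug lines from several servers, a small log filter may help spot the case discussed in this thread, where a NULL pwq pointer gets installed. The following is only a minimal sketch: the field layout is inferred from the single sample line quoted above, and the names `LINE_RE`, `parse_pwq_line` and `installs_null` are hypothetical helpers, not part of the debug patch or the kernel.

```python
import re

# Field layout assumed from the one sample line in this thread:
#   WQ: <ptr> (<name>) old_pwq: <ptr|(null)> new_pwq: <ptr|(null)> node: <int>
LINE_RE = re.compile(
    r"WQ: (?P<wq>\S+) \((?P<name>[^)]*)\) "
    r"old_pwq: (?P<old_pwq>\(null\)|[0-9a-f]+) "
    r"new_pwq: (?P<new_pwq>\(null\)|[0-9a-f]+) "
    r"node: (?P<node>-?\d+)"
)


def parse_pwq_line(line):
    """Return a dict of fields for one debug line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["node"] = int(rec["node"])
    # The kernel prints NULL pointers as "(null)"; map those to None.
    for key in ("old_pwq", "new_pwq"):
        if rec[key] == "(null)":
            rec[key] = None
    return rec


def installs_null(rec):
    """True when the parsed record shows a NULL pwq being installed."""
    return rec is not None and rec["new_pwq"] is None
```

Feeding each dmesg line through `parse_pwq_line` and keeping only the records for which `installs_null` is true would narrow the output down to the installs that could explain a NULL pointer later seen in __queue_work.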