Date: Fri, 11 Dec 2015 14:14:01 -0500
From: Mike Snitzer
To: Nikolay Borisov
Cc: Tejun Heo, Nikolay Borisov, "Linux-Kernel@Vger. Kernel. Org",
    SiteGround Operations, Alasdair Kergon, device-mapper development
Subject: Re: corruption causing crash in __queue_work
Message-ID: <20151211191400.GA24229@redhat.com>
References: <566819D8.5090804@kyup.com> <20151209160803.GK30240@mtj.duckdns.org>
    <56685573.1020805@kyup.com> <20151209162744.GN30240@mtj.duckdns.org>
    <566945A2.1050208@kyup.com> <20151210152901.GR30240@mtj.duckdns.org>
    <566AF262.8050009@kyup.com> <20151211170805.GT30240@mtj.duckdns.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 11 2015 at 1:00pm -0500,
Nikolay Borisov wrote:

> On Fri, Dec 11, 2015 at 7:08 PM, Tejun Heo wrote:
> >
> > Hmmm... No idea why it didn't show up in the debug log but the only
> > way a workqueue could be in the above state is either it got
> > explicitly destroyed or somehow pwq refcnting is messed up, in both
> > cases it should have shown up in the log.
> >
> > cc'ing dm people. Is there any chance dm-thinp could be using
> > workqueue after destroying it?

Not that I'm aware of. But never say never? Plus I'd think we'd see
other dm-thinp specific use-after-free issues aside from the thin-pool's
workqueue.

> In __pool_destroy in dm-thin.c I don't see a call to
> cancel_delayed_work before destroying the workqueue. Is it possible
> that this is the cause?

Cannot see how, __pool_destroy()'s destroy_workqueue() would spew a
bunch of WARN_ONs (and the wq wouldn't be destroyed) if the workqueue
had outstanding work.

__pool_destroy() is called once the thin-pool's ref count drops to 0
(see __pool_dec which is called when the thin-pool is removed -- e.g.
with 'dmsetup remove'). This code is only reachable when nothing else
is using the thin-pool.

And the thin-pool is only able to be removed if all thin devices that
depend on it have first been removed. And each individual thin device
waits for all outstanding IO before it can be removed.
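
To make the ordering argument above concrete, here is a rough userspace toy model in C. It is not the dm-thin code: every name in it (toy_pool, toy_pool_dec, toy_pool_destroy, toy_thin_remove, toy_pool_remove, toy_demo) is hypothetical, and the "wait for outstanding IO" step is reduced to zeroing a counter. It only sketches the claimed invariant, assuming the description in this message is accurate: the destroy path is reached when the ref count hits 0, and by then every dependent thin device has already drained its IO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model only; NOT the dm-thin implementation.
 * All names below are hypothetical and exist for illustration. */
struct toy_pool {
    int ref_count;       /* references held on the pool */
    int thin_devices;    /* thin devices that still depend on the pool */
    int outstanding_io;  /* IO/work still in flight */
    bool destroyed;      /* set once the destroy path has completed */
};

/* Stands in for the destroy path: as described in the message, it warns
 * and does not complete if work is still outstanding. */
static void toy_pool_destroy(struct toy_pool *p)
{
    if (p->outstanding_io > 0) {
        fprintf(stderr, "WARN: %d item(s) still outstanding\n",
                p->outstanding_io);
        return;
    }
    p->destroyed = true;
}

/* Stands in for the ref count drop: destroy only when it reaches 0. */
static void toy_pool_dec(struct toy_pool *p)
{
    if (--p->ref_count == 0)
        toy_pool_destroy(p);
}

/* Removing a thin device first "waits" for its IO (modelled here by
 * simply zeroing the counter), then drops the dependency. */
static void toy_thin_remove(struct toy_pool *p)
{
    assert(p->thin_devices > 0);
    p->outstanding_io = 0;
    p->thin_devices--;
}

/* The pool can only be removed once no thin device depends on it. */
static bool toy_pool_remove(struct toy_pool *p)
{
    if (p->thin_devices > 0)
        return false;
    toy_pool_dec(p);
    return true;
}

/* Walks the sequence from the message; returns 0 if the model behaves
 * as described, non-zero otherwise. */
int toy_demo(void)
{
    struct toy_pool p = {
        .ref_count = 1, .thin_devices = 1,
        .outstanding_io = 3, .destroyed = false,
    };

    if (toy_pool_remove(&p))   /* must be refused: a thin device remains */
        return 1;
    toy_thin_remove(&p);       /* drains IO, then drops the dependency */
    if (!toy_pool_remove(&p))  /* now allowed; ref count drops to 0 */
        return 2;
    return p.destroyed ? 0 : 3;
}
```

Calling toy_demo() from a small main() exercises the whole sequence. The model deliberately says nothing about delayed work that could be re-queued after the last thin device is gone, which is the case the cancel_delayed_work question is really about.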