Subject: Re: [PATCH v2] blktrace: Fix potentail deadlock between delete & sysfs ops
From: Waiman Long
Organization: Red Hat
Date: Fri, 18 Aug 2017 13:22:44 -0400
To: Bart Van Assche, rostedt@goodmis.org
Cc: bfields@fieldses.org, mingo@kernel.org, jlayton@poochiereds.net,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    axboe@kernel.dk, linux-fsdevel@vger.kernel.org

On 08/18/2017 12:21 PM, Bart Van Assche wrote:
> On Fri, 2017-08-18 at 09:55 -0400, Waiman Long wrote:
>> On 08/17/2017 05:30 PM, Steven Rostedt wrote:
>>> On Thu, 17 Aug 2017 17:10:07 -0400
>>> Steven Rostedt wrote:
>>>> Instead of playing games with taking the lock, the only way this race
>>>> is hit, is if the partition is being deleted and the sysfs attribute is
>>>> being read at the same time, correct? In that case, just return
>>>> -ENODEV, and be done with it.
>>> Nevermind that wont work. Too bad there's not a mutex_lock_timeout()
>>> that we could use in a loop. It would solve the issue of forward
>>> progress with RT tasks, and will break after a timeout in case of
>>> deadlock.
>> I think it will be useful to have mutex_timed_lock(). RT-mutex does have
>> a timed version, so I guess it shouldn't be hard to implement one for
>> mutex. I can take a shot at trying to do that.
> (just caught up with the entire e-mail thread)
>
> Sorry Waiman but personally I thoroughly detest loops around mutex_trylock() or
> mutex_timed_lock() because such loops are usually used to paper over a problem
> instead of fixing the root cause. What I understood from the comment in v1 of your
> patch is that bd_mutex is not only held during block device creation and removal
> but additionally that bd_mutex is obtained inside sysfs attribute callback methods?
> That pattern is guaranteed to lead to deadlocks. Since the block device removal
> code waits until all sysfs callback methods have finished there is no need to
> protect against block device removal inside the sysfs callback methods. My proposal

You are right. We don't really need to take bd_mutex, because being
inside the sysfs callback method already guarantees that the block
device won't go away.

> is to split bd_mutex: one global mutex that serializes block device creation and
> removal and one mutex per block device that serializes changes to a single block
> device. Obtaining the global mutex from inside a block device sysfs callback
> function is not safe but obtaining the per-block-device mutex from inside a sysfs
> callback function is safe.
>
> Bart.

The bd_mutex we are talking about here is already per block device. I am
thinking about having a global blktrace mutex that is used to serialize
the reads and writes of the blktrace attributes. Since the blktrace sysfs
files are not supposed to be accessed frequently, having a global lock
shouldn't cause any problem.

Thanks,
Longman
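
To make the global blktrace mutex idea more concrete, here is a rough
userspace sketch. It is not the actual patch: pthread mutexes stand in
for kernel mutexes, and struct fake_bdev, blktrace_attr_mutex and the
two fake_blktrace_attr_*() helpers are hypothetical names made up for
illustration. The point is only that the attribute callbacks take one
dedicated global lock and never touch the per-device bd_mutex, so they
cannot be part of a lock cycle with the removal path.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the blktrace state of one block device. */
struct fake_bdev {
	unsigned long act_mask;	/* example attribute value */
};

/*
 * One global lock for all blktrace attribute accesses (assumption:
 * this replaces taking the per-device bd_mutex in the callbacks).
 */
static pthread_mutex_t blktrace_attr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Models a sysfs "show" callback: formats act_mask into buf. */
int fake_blktrace_attr_show(struct fake_bdev *bdev, char *buf, size_t len)
{
	int ret;

	assert(bdev && buf);

	pthread_mutex_lock(&blktrace_attr_mutex);
	ret = snprintf(buf, len, "%#lx\n", bdev->act_mask);
	pthread_mutex_unlock(&blktrace_attr_mutex);

	return ret;
}

/* Models a sysfs "store" callback: parses buf and updates act_mask. */
int fake_blktrace_attr_store(struct fake_bdev *bdev, const char *buf)
{
	char *end;
	unsigned long val;

	assert(bdev && buf);

	/* Parsing needs no lock; only the shared state update does. */
	val = strtoul(buf, &end, 0);
	if (end == buf)
		return -EINVAL;

	pthread_mutex_lock(&blktrace_attr_mutex);
	bdev->act_mask = val;
	pthread_mutex_unlock(&blktrace_attr_mutex);

	return 0;
}
```

Because these files are rarely accessed, contention on the single lock
should be negligible, which is why a global lock looks acceptable here.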