From: Bart Van Assche
To: "longman@redhat.com", "rostedt@goodmis.org"
CC: "bfields@fieldses.org", "mingo@kernel.org", "jlayton@poochiereds.net", "linux-block@vger.kernel.org", "linux-kernel@vger.kernel.org", "axboe@kernel.dk", "linux-fsdevel@vger.kernel.org"
Subject: Re: [PATCH v2] blktrace: Fix potentail deadlock between delete & sysfs ops
Date: Fri, 18 Aug 2017 16:21:46 +0000
Message-ID: <1503073304.2622.5.camel@wdc.com>
References: <1502916040-18067-1-git-send-email-longman@redhat.com> <20170817093444.3276f7ab@gandalf.local.home> <20170817171007.1ab33b8f@gandalf.local.home> <20170817173004.263d2891@gandalf.local.home> <5a5d0743-d2db-89c8-59cc-542835baeccf@redhat.com>
In-Reply-To: <5a5d0743-d2db-89c8-59cc-542835baeccf@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2017-08-18 at 09:55 -0400, Waiman Long wrote:
> On 08/17/2017 05:30 PM, Steven Rostedt wrote:
> > On Thu, 17 Aug 2017 17:10:07 -0400
> > Steven Rostedt wrote:
> > > Instead of playing games with taking the lock, the only way this race
> > > is hit, is if the partition is being deleted and the sysfs attribute is
> > > being read at the same time, correct? In that case, just return
> > > -ENODEV, and be done with it.
> >
> > Nevermind that wont work. Too bad there's not a mutex_lock_timeout()
> > that we could use in a loop. It would solve the issue of forward
> > progress with RT tasks, and will break after a timeout in case of
> > deadlock.
>
> I think it will be useful to have mutex_timed_lock(). RT-mutex does have
> a timed version, so I guess it shouldn't be hard to implement one for
> mutex. I can take a shot at trying to do that.

(just caught up with the entire e-mail thread)

Sorry Waiman but personally I thoroughly detest loops around mutex_trylock()
or mutex_timed_lock() because such loops are usually used to paper over a
problem instead of fixing the root cause. What I understood from the comment
in v1 of your patch is that bd_mutex is not only held during block device
creation and removal but additionally that bd_mutex is obtained inside sysfs
attribute callback methods? That pattern is guaranteed to lead to deadlocks.
Since the block device removal code waits until all sysfs callback methods
have finished there is no need to protect against block device removal inside
the sysfs callback methods.

My proposal is to split bd_mutex: one global mutex that serializes block
device creation and removal and one mutex per block device that serializes
changes to a single block device. Obtaining the global mutex from inside a
block device sysfs callback function is not safe but obtaining the
per-block-device mutex from inside a sysfs callback function is safe.

Bart.
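
To make the proposed split concrete, here is a rough user-space sketch using
pthreads. It is not kernel code and not taken from any patch in this thread.
All names (struct blkdev, topology_lock, dev_lock, blkdev_*) are hypothetical
and chosen for illustration only. The point it tries to show is the lock
ordering: removal takes the global lock and then the per-device lock, while
the attribute callback takes only the per-device lock, so the callback can
never wait on a lock that the removal path holds while waiting for callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a block device. */
struct blkdev {
    pthread_mutex_t dev_lock;   /* per-device: serializes changes to this device */
    int removing;               /* set once removal has started */
    int trace_enabled;          /* example of per-device state */
};

/* Global lock (assumed name): serializes device creation and removal only. */
static pthread_mutex_t topology_lock = PTHREAD_MUTEX_INITIALIZER;

struct blkdev *blkdev_create(void)
{
    struct blkdev *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    pthread_mutex_init(&d->dev_lock, NULL);

    pthread_mutex_lock(&topology_lock);
    /* Registration in a global device list would happen here. */
    pthread_mutex_unlock(&topology_lock);
    return d;
}

/*
 * Attribute callback analogue: takes ONLY the per-device lock and never
 * touches topology_lock, so it cannot deadlock against removal.
 */
int blkdev_attr_store(struct blkdev *d, int enable)
{
    int ret = 0;

    pthread_mutex_lock(&d->dev_lock);
    if (d->removing)
        ret = -ENODEV;          /* device is going away; leave state alone */
    else
        d->trace_enabled = enable;
    pthread_mutex_unlock(&d->dev_lock);
    return ret;
}

/*
 * Removal: global lock first, then the per-device lock. Acquiring dev_lock
 * waits for any callback currently running on this device to finish.
 */
void blkdev_remove(struct blkdev *d)
{
    pthread_mutex_lock(&topology_lock);
    pthread_mutex_lock(&d->dev_lock);
    d->removing = 1;
    pthread_mutex_unlock(&d->dev_lock);
    /* Unregistration from a global device list would happen here. */
    pthread_mutex_unlock(&topology_lock);
}

/* Free the device; only valid after blkdev_remove() and with no users left. */
void blkdev_destroy(struct blkdev *d)
{
    assert(d->removing);
    pthread_mutex_destroy(&d->dev_lock);
    free(d);
}
```

A caller would create a device with blkdev_create(), invoke
blkdev_attr_store() from any thread, and call blkdev_remove() followed by
blkdev_destroy() on teardown. In the kernel the "wait for callbacks" step is
done by the sysfs removal code itself rather than by taking the per-device
lock, so this sketch only models the ordering argument, not the real
mechanism.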