Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752783AbdLHFgg (ORCPT ); Fri, 8 Dec 2017 00:36:36 -0500 Received: from imap1.codethink.co.uk ([176.9.8.82]:46995 "EHLO imap1.codethink.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751275AbdLHFge (ORCPT ); Fri, 8 Dec 2017 00:36:34 -0500 Message-ID: <1512711385.18523.250.camel@codethink.co.uk> Subject: Re: [PATCH 4.4 13/16] ocfs2: should wait dio before inode lock in ocfs2_setattr() From: Ben Hutchings To: alex chen Cc: Greg Kroah-Hartman , piaojun , Joseph Qi , "stable@vger.kernel.org" , Changwei Ge , Mark Fasheh , Joel Becker , Junxiao Bi , Andrew Morton , Linus Torvalds , LKML Date: Fri, 08 Dec 2017 05:36:25 +0000 In-Reply-To: <5A2A0F11.2090908@huawei.com> References: <20171122101110.784746358@linuxfoundation.org> <20171122101111.411869812@linuxfoundation.org> <1512488994.18523.173.camel@codethink.co.uk> <5A2741B2.4040509@huawei.com> <1512671158.18523.187.camel@codethink.co.uk> <5A29DF39.9080305@huawei.com> <1512699985.18523.219.camel@codethink.co.uk> <5A2A0F11.2090908@huawei.com> Organization: Codethink Ltd. Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.22.6-1+deb9u1 Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3254 Lines: 90 On Fri, 2017-12-08 at 12:03 +0800, alex chen wrote: > > On 2017/12/8 10:26, Ben Hutchings wrote: > > On Fri, 2017-12-08 at 08:39 +0800, alex chen wrote: > > > > > > On 2017/12/8 2:25, Ben Hutchings wrote: > > > > On Wed, 2017-12-06 at 09:02 +0800, alex chen wrote: > > > > > Hi Ben, > > > > > > > > > > Thanks for your reply. > > > > > > > > > > On 2017/12/5 23:49, Ben Hutchings wrote: > > > > > > On Wed, 2017-11-22 at 11:12 +0100, Greg Kroah-Hartman wrote: > > > > > > > 4.4-stable review patch.  If anyone has any objections, > > > > > > > please let me know. > > > > > > > > > > > > > > ------------------ > > > > > > > > > > > > > > From: alex chen > > > > > > > > > > > > > > commit 28f5a8a7c033cbf3e32277f4cc9c6afd74f05300 upstream. > > > > > > > > > > > > > > we should wait dio requests to finish before inode lock in > > > > > > > ocfs2_setattr(), otherwise the following deadlock will > > > > > > > happen: > > > > > > > > > > > > [...] > > > > > > > > > > > > I looked at the kernel-doc for inode_dio_wait(): > > > > > > > > > > > > /** > > > > > >  * inode_dio_wait - wait for outstanding DIO requests to finish > > > > > >  * @inode: inode to wait for > > > > > >  * > > > > > >  * Waits for all pending direct I/O requests to finish so that we can > > > > > >  * proceed with a truncate or equivalent operation. > > > > > >  * > > > > > >  * Must be called under a lock that serializes taking new references > > > > > >  * to i_dio_count, usually by inode->i_mutex. > > > > > >  */ > > > > > > > > > > > > Now that ocfs2_setattr() calls this outside of the inode locked region, > > > > > > what prevents another task adding a new dio request immediately > > > > > > afterward? > > > > > > > > > > > > > > > > In the kernel 4.6, firstly, we use the inode_lock() in do_truncate() to > > > > > prevent another bio to be issued from this node. > > > > > > > > [...] > > > > > > > > Yes but there seems to be a race condition - after the call to > > > > inode_dio_wait() and before the call to inode_lock(), another dio > > > > request can be added. > > > > Sorry, I've been mixing up inode_lock() and ocfs2_inode_lock().  > > However: > > > > > In the truncating file situation, the lock order is as follow: > > > do_truncate() > > >  inode_lock() > > >  notify_change() > > >   ocfs2_setattr() > > >    inode_dio_wait() > > >     --here it is under the protect of inode_lock(), so another dio requests > > >       from another process will not be added. > > > > only DIO reads seem to take the inode lock. > > > > I do not clearly understand what you mean. > The inode_lock() will be called in ocfs2_file_write_iter(). Oh I see. I didn't realise that was part of the call chain. > You mean only DIO writes seem to take the inode_lock()? I did mean reads, as do_blockdev_direct_IO() may call inode_lock() for reads - but ocfs2 doesn't set the flag for that. Maybe that's OK? > BTW, in this patch, I just adjusted the inode_dio_wait() to the front of the ocfs2_rw_lock() > and didn't adjust the order of inode_lock() and inode_dio_wait(). Right. I think you've convinced me to stop worrying about this. Ben. -- Ben Hutchings Software Developer, Codethink Ltd.