Return-Path: linux-nfs-owner@vger.kernel.org Received: from mail-vc0-f181.google.com ([209.85.220.181]:44892 "EHLO mail-vc0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752853AbaH2PvH (ORCPT ); Fri, 29 Aug 2014 11:51:07 -0400 Received: by mail-vc0-f181.google.com with SMTP id ij19so2661594vcb.40 for ; Fri, 29 Aug 2014 08:51:05 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: References: <1438879608.14503585.1407251825405.JavaMail.zimbra@redhat.com> <1408580933.4029.2.camel@leira.trondhjem.org> <53F612DC.60006@redhat.com> <53F63758.70306@redhat.com> Date: Fri, 29 Aug 2014 11:51:05 -0400 Message-ID: Subject: Re: [PATCH] nfs: Always try and release an NFS file lock, even after receiving a SIGKILL From: Trond Myklebust To: Benjamin Coddington Cc: David Jeffery , Linux NFS Mailing List Content-Type: text/plain; charset=UTF-8 Sender: linux-nfs-owner@vger.kernel.org List-ID: On Fri, Aug 29, 2014 at 11:34 AM, Benjamin Coddington wrote: > > On Thu, 21 Aug 2014, Trond Myklebust wrote: >> >> On Thu, Aug 21, 2014 at 2:15 PM, David Jeffery >> wrote: >>> >>> On 08/21/2014 11:50 AM, Trond Myklebust wrote: >>>> >>>> On Thu, Aug 21, 2014 at 11:40 AM, David Jeffery >>>> wrote: >>>>> >>>>> On 08/20/2014 08:28 PM, Trond Myklebust wrote: >>>>>> >>>>>> >>>>>> What guarantees that this does not lead to silent corruption of the >>>>>> file >>>>>> if there are outstanding write requests? >>>>>> >>>>> >>>>> Do you have a particular scenario in mind you are concerned about? >>>>> >>>>> Right before the code the patch modifies, nfs_sync_mapping() is called. >>>>> Any writes started before the unlock operation began have already been >>>>> flushed, so we shouldn't have a corruption case of writes from before >>>>> the unlock began being sent after the unlock is complete. >>>>> >>>>> Are you concerned about some other nfs4 writes being started while we >>>>> initially waited on the counter? Such a write racing with the unlock >>>> >>>> >>>> No. I'm worried about the writes that have been started, but which are >>>> now completing in the background while the lock is being freed. >>>> >>>>> going ahead instead of erroring out could initially fail from a wrong >>>>> state ID, but should retry with the new state. Is there something I've >>>>> overlooked? >>>> >>>> >>>> Loss of lock atomicity due to the fact that the writes are completing >>>> after the lock was released. >>>> >>> >>> I don't think my patches break lock atomicity, unless I've completely >>> misunderstood nfs_sync_mapping()'s behavior. >>> >>> The background writes should have already been waited for by >>> nfs_sync_mapping() at the beginning of do_unlk(). nfs_sync_mapping() >>> would call nfs_wb_all() which calls sync_inode() with WB_SYNC_ALL, so >>> it's going to push out any dirty data and wait for any writes to >>> complete. Only after the writes are complete do we go on to call >>> nfs_iocounter_wait(). >> >> >> What if nfs_wb_all() is interrupted? Currently, that doesn't matter >> precisely because of the call to nfs_iocounter_wait. > > > Hi Trond, it looks like the intent of: > > 577b423 NFS: Add functionality to allow waiting on all outstanding reads.. > 7a8203d NFS: Ensure that NFS file unlock waits for readahead to complete.. > > was to wait for readahead, not to avoid releasing the lock if nfs_wb_all() > is interrupted. What was the underlying reason for waiting for readahead to > complete? I know there was one issue where the NFS server would release the > lockowner before processing the reads - was that what these were trying to > avoid? I ask because if we're only worried about outstanding writes we > could just wait on the writes, not the reads. > > Would it be better to use TASK_UNINTERRUPTIBLE instead of TASK_KILLABLE in > __nfs_iocounter_wait? > > Ben Hi Ben, The main intent of the two patches was to ensure that all I/O that uses the lock stateid should have completed before we attempt to release the lock. The main reason is to avoid an unnecessary "recovery" situation where the server asks us to resend the I/O because the lock stateid is no longer valid. Concerning writes; if there are any outstanding then we must ensure those complete before we unlock the file. This rule should be unaffected by the above 2 patches (although it is just a bonus if they help to enforce it). Cheers Trond -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@primarydata.com