Return-Path: Received: from aserp1040.oracle.com ([141.146.126.69]:45499 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932596AbbIDTIi convert rfc822-to-8bit (ORCPT ); Fri, 4 Sep 2015 15:08:38 -0400 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: async CLOSE creates a race with a DELEGRETURN followed by a REMOVE From: Chuck Lever In-Reply-To: Date: Fri, 4 Sep 2015 15:08:30 -0400 Cc: Trond Myklebust , Linux NFS Mailing List Message-Id: References: <5163FEAE-9DBF-46BE-AC7B-E276069CE893@oracle.com> To: Olga Kornievskaia Sender: linux-nfs-owner@vger.kernel.org List-ID: On Sep 4, 2015, at 2:09 PM, Olga Kornievskaia wrote: > On Thu, Sep 3, 2015 at 8:31 PM, Chuck Lever wrote: >> >> On Sep 3, 2015, at 8:18 PM, Olga Kornievskaia wrote: >> >>> On Thu, Sep 3, 2015 at 6:37 PM, Trond Myklebust >>> wrote: >>>> >>>> On Sep 3, 2015 18:08, "Olga Kornievskaia" wrote: >>>>> >>>>> On Thu, Sep 3, 2015 at 3:34 PM, Chuck Lever >>>>> wrote: >>>>>> >>>>>> On Sep 3, 2015, at 3:26 PM, Olga Kornievskaia wrote: >>>>>> >>>>>>> When file is opened with O_DIRECT, sending asynchronous CLOSE creates >>>>>>> a race condition between CLOSE and DELEGRETURN followed REMOVE. Next >>>>>>> operations are sent before the reply for CLOSE comes back which is >>>>>>> according to the spec is not allowed (section 10.4.4). Server can get >>>>>>> a REMOVE before the CLOSE and it causes EACCESS error to be returned >>>>>>> because server thinks there is open state left. >>>>>>> >>>>>>> According to the spec: "...whenever a client chooses to return a >>>>>>> delegation voluntarily. The following items of state need to be dealt >>>>>>> with: >>>>>>> >>>>>>> o If the file associated with the delegation is no longer open and >>>>>>> no previous CLOSE operation has been sent to the server, a CLOSE >>>>>>> operation must be sent to the server." >>>>>>> >>>>>>> So i'm not sure if it will be argued that it doesn't say that a reply >>>>>>> must be received before sending the DELEGRETURN. However, if it's not >>>>>>> mandated then a race as I'm mentioning occurs. >>>>>>> >>>>>>> The following patch introduces making close async: >>>>>>> commit f895c53f8ace3c3e49ebf9def90e63fc6d46d2bf >>>>>>> Author: Chuck Lever >>>>>>> Date: Mon Feb 1 14:17:50 2010 -0500 >>>>>>> >>>>>>> NFS: Make close(2) asynchronous when closing NFS O_DIRECT files >>>>>>> >>>>>>> For NFSv2 and v3: >>>>>>> >>>>>>> O_DIRECT writes are always synchronous, and aren't cached, so >>>>>>> nothing >>>>>>> should be flushed when closing an NFS O_DIRECT file descriptor. >>>>>>> Thus >>>>>>> there are no write errors to report on close(2). >>>>>>> >>>>>>> In addition, there's no cached data to verify on the next open(2), >>>>>>> so we don't need clean GETATTR results at close time to compare >>>>>>> with. >>>>>>> >>>>>>> Thus, there's no need for the nfs_revalidate_inode() call when >>>>>>> closing >>>>>>> an NFS O_DIRECT file. This reduces the number of synchronous >>>>>>> on-the-wire requests for a simple open-write-close of an NFS >>>>>>> O_DIRECT >>>>>>> file by roughly 20%. >>>>>>> >>>>>>> For NFSv4: >>>>>>> >>>>>>> Call nfs4_do_close() with wait set to zero when closing an NFS >>>>>>> O_DIRECT file. The CLOSE will go on the wire, but the application >>>>>>> won't wait for it to complete. >>>>>>> >>>>>>> I'd like to suggest to revert the use of this patch. >>>>>> >>>>>> DELEGRETURN can be made to wait for the CLOSE reply. >>>>>> >>>>>> Or, maybe the client should return an offered delegation >>>>>> immediately when a file is open O_DIRECT. I don't see much >>>>>> use in keeping the delegation if the application wants >>>>>> GETATTR, READ and WRITE always flushed to the server >>>>>> immediately. >>>>> >>>>> Yes i think somehow a delegreturn should be made to wait for close. >>>>> Because this race also happens for other opens when close is sent >>>>> async but I haven't pinned pointed the case. It looks like async write >>>>> calls async close so perhaps that's it. >>>>> >>>>> Though how much are we saving by making CLOSE async vs making the code >>>>> more complicated? >>>>> >>>> I'd far prefer reverting the asynchronous close if it is causing problems. >>>> Serialisation is harder to get right. Particularly so when we need to do it >>>> from within the context of the state manager thread.... >>>> >>> >>> Agreed. >>> >>> Also I'd like to point out that coordination with CLOSE needs to be >>> done not only with DELEGRETURN but with REMOVE as well. In the current >>> kernel, if open is done with O_DIRECT, then the code sets >>> WANT_NO_DELEGATION flag and server does not give one. However, the >>> close being asynchronous if followed by a REMOVE will cause EACCESS >>> error if CLOSE arrives after the REMOVE. >> >> NFS4ERR_ACCESS is allowed for REMOVE, but why would a server >> return ACCESS in this case? > > Because client has open state left. NFS4ERR_ACCESS does not mean "there is open state left." I think the correct status should be NFS4ERR_FILE_OPEN, but it's optional whether a server can remove an open file. See RFC 7530 Section 13.1.4.5. ACCESS is plausible if the client had opened the file with a deny mode, but I didn't think Linux did that. >>> I have a test case and setup where I delay the CLOSE via proxy and I >>> have an open with O_DIRECT. I get no delegation and after I do close() >>> and remove(), the latter fails. >> >> Can this be fixed without causing a performance regression >> for NFSv3, which is not afflicted by any of these ordering >> requirements? > > Since there is no CLOSE in NFSv3 why would it cause performance regression? NFSv3 does a synchronous GETATTR during close(2). The commit you want to revert, in addition to altering NFSv4 CLOSE behavior, eliminates that synchronous GETATTR when the file is open with O_DIRECT. Thus, if you revert that commit, it will result in a performance regression for NFSv3. -- Chuck Lever