From: Amir Goldstein
Date: Tue, 19 May 2020 07:00:31 +0300
Subject: Re: [PATCH] ceph: don't return -ESTALE if there's still an open file
To: Gregory Farnum
Cc: Jeff Layton, Luis Henriques, Ilya Dryomov, ceph-devel, linux-kernel,
    fstests, Dave Chinner, Christoph Hellwig, Miklos Szeredi
References: <20200514111453.GA99187@suse.com>
    <8497fe9a11ac1837813ee5f14b6ebae8fa6bf707.camel@kernel.org>
    <20200514124845.GA12559@suse.com>
    <4e5bf0e3bf055e53a342b19d168f6cf441781973.camel@kernel.org>
    <20200515111548.GA54598@suse.com>
    <61b1f19edcc349641b5383c2ac70cbf9a15ba4bd.camel@kernel.org>

On Tue, May 19, 2020 at 1:30 AM Gregory Farnum wrote:
>
> Maybe we resolved this conversation; I can't quite tell...

I think v2 patch wraps it up...

[...]

> > >
> > > Questions:
> > > 1. Does sync() result in fully purging inodes on MDS?
> >
> > I don't think so, but again, that code is not trivial to follow. I do
> > know that the MDS keeps around a "strays directory" which contains
> > unlinked inodes that are lazily cleaned up. My suspicion is that it's
> > satisfying lookups out of this cache as well.
> >
> > Which may be fine...the MDS is not required to be POSIX compliant after
> > all. Only the fs drivers are.
>
> I don't think this is quite that simple. Yes, the MDS is certainly
> giving back stray inodes in response to a lookup-by-ino request. But
> that's for a specific purpose: we need to be able to give back caps on
> unlinked-but-open files. For NFS specifically, I don't know what the
> rules are on NFS file handles and unlinked files, but the Ceph MDS
> won't know when files are closed everywhere, and it translates from
> NFS fh to Ceph inode using that lookup-by-ino functionality.
>

There is no protocol rule that NFS server MUST return ESTALE
for file handle of a deleted file, but there is a rule that it MAY
return ESTALE for deleted file. For example, on server restart
and traditional block filesystem, there is not much choice.

So returning ESTALE when file is deleted but opened on another
ceph client is definitely allowed by the protocol standard, the question
is whether changing the behavior will break any existing workloads...

> >
> > > 2. Is i_nlink synchronized among nodes on deferred delete?
> > > IWO, can inode come back from the dead on client if another node
> > > has linked it before i_nlink 0 was observed?
> >
> > No, that shouldn't happen. The caps mechanism should ensure that it
> > can't be observed by other clients until after the change.
> >
> > That said, Luis' current patch doesn't ensure we have the correct caps
> > to check the i_nlink. We may need to add that in before we can roll with
> > this.
> >
> > > 3. Can an NFS client be "migrated" from one ceph node to another
> > > with an open but unlinked file?
> > >
> >
> > No. Open files in ceph are generally per-client. You can't pass around a
> > fd (or equivalent).
>
> But the NFS file handles I think do work across clients, right?
>

Maybe they can, but that would be like NFS server restart, so
all bets are off w.r.t open but deleted files.

Thanks,
Amir.
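
For readers following the thread, here is a minimal userspace sketch of the
"unlinked-but-open" case discussed above. It is not the code from the patch.
The first function only shows that a local file stays readable through an open
fd after its last name is removed. The second, handle_lookup_status(), is a
hypothetical model of the policy named in the subject line; its name and its
open_refs parameter are assumptions made for illustration, and a real
filesystem may (as noted above) still choose to return ESTALE.

```python
import errno
import os
import tempfile


def demo_open_but_unlinked():
    """Unlink a file while an fd is still open and report what the fd sees."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                    # remove the last name for the file
        nlink = os.fstat(fd).st_nlink      # link count as seen via the open fd
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)             # contents remain readable via the fd
        return nlink, data, os.path.exists(path)
    finally:
        os.close(fd)


def handle_lookup_status(nlink, open_refs):
    """Hypothetical policy model (an assumption, not kernel code):
    report ESTALE only when the inode is unlinked AND nobody holds it open."""
    if nlink == 0 and open_refs == 0:
        return -errno.ESTALE
    return 0


if __name__ == "__main__":
    nlink, data, still_named = demo_open_but_unlinked()
    print("st_nlink after unlink:", nlink)
    print("data via open fd:", data)
    print("path still exists:", still_named)
    print("unlinked, no openers:", handle_lookup_status(0, 0))
    print("unlinked, one opener:", handle_lookup_status(0, 1))
```

The model takes the open-reference count as an input on purpose: as pointed
out in the thread, the hard part is knowing whether the file is still open
anywhere, and this sketch does not attempt to answer that.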