Date: Tue, 25 Sep 2018 11:46:27 -0400
From: "Theodore Y. Ts'o"
To: Jeff Layton
Cc: Alan Cox, 焦晓冬, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Rogier Wolff
Subject: Re: POSIX violation by writeback error
Message-ID: <20180925154627.GC2933@thunk.org>

On Tue, Sep 25, 2018 at 07:15:34AM -0400, Jeff Layton wrote:
> Linux has dozens of filesystems and they all behave differently in this
> regard. A catastrophic failure (paradoxically) makes things simpler for
> the fs developer, but even on local filesystems isolated errors can
> occur. It's also not just NFS -- what mostly started me down this road
> was working on ENOSPC handling for CephFS.
>
> I think it'd be good to at least establish a "gold standard" for what
> filesystems ought to do in this situation. We might not be able to
> achieve that in all cases, but we could then document the exceptions.

I'd argue the standard should be the precedent set by AFS and NFS.
AFS verifies space available on close(2) and returns ENOSPC from the
close(2) system call if space is not available.  At MIT Project
Athena, where we used AFS extensively in the late 80's and early
90's, we made and contributed back changes to avoid data loss as a
result of quota errors.

The best practice that should be documented for userspace is this: when
writing precious files[1], programs should open foo.new for writing,
write out the data, call fsync() and check the error return, call
close() and check the error return, and then call rename(foo.new, foo)
and check the error return.  Writing a library function which does
this, and which also copies the ACLs and xattrs from foo to foo.new
before the rename(), would probably help, but not as much as we might
think.

[1] That is, editors writing source files, but not compilers and
similar programs writing object files and other generated files.

None of this is really all that new.  We had the same discussion back
during the O_PONIES controversy, and we came out in the same place.

					- Ted

P.S.  One thought: it might be cool if there were some way for
userspace applications to mark files with a "nuke if not closed" flag,
such that the file system would automatically unlink the file after a
reboot if the system crashes, or if the process is killed or exits
without an explicit close(2).  For networked/remote file systems that
supported this flag, after the client comes back up after a reboot, it
could notify the server that all files created previously from that
client should be unlinked.  Unlike O_TMPFILE, this would require file
system changes to support, so maybe it's not worth having something
which automatically cleans up files that were in the middle of being
written at the time of a system crash.  (Especially since you can get
most of the functionality by using some naming convention for files
that are in the process of being written, and then teaching some
program that regularly scans the entire file system, such as
updatedb(8), to nuke the files from a cron job.  It won't be as
efficient, but it would be much easier to implement.)
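
For illustration, the foo.new sequence recommended in the message (write,
fsync, close, rename, checking each step) might look roughly like the
sketch below.  The function name replace_file_safely and the 0644 mode are
assumptions made for the example, not anything from the thread.  Python
reports a failing system call by raising OSError, so "check the error
return" here means letting that exception propagate.  The sketch does not
copy ACLs or xattrs from the old file.

```python
import os


def replace_file_safely(path: str, data: bytes) -> None:
    """Write data to path + ".new", force it to disk, then rename it over path.

    Hypothetical helper for illustration; any failing step raises OSError.
    """
    tmp_path = path + ".new"  # follows the foo.new naming from the message
    try:
        fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            view = memoryview(data)
            while len(view) > 0:
                # os.write may write fewer bytes than requested; loop until done.
                written = os.write(fd, view)
                view = view[written:]
            # Errors such as ENOSPC or EIO from writeback surface here.
            os.fsync(fd)
        finally:
            # close() can itself report an error (e.g. on AFS or NFS).
            os.close(fd)
        # Only replace the old file once the new contents are known to be safe.
        os.rename(tmp_path, path)
    except OSError:
        # Leave the original file untouched and drop the partial copy.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A caller would use it as replace_file_safely("foo", new_contents) and
treat an OSError as "the old foo is still intact".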