Date: Tue, 25 Sep 2018 18:30:54 -0400
From: "Theodore Y. Ts'o"
To: Jeff Layton
Cc: Alan Cox, 焦晓冬, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Rogier Wolff
Subject: Re: POSIX violation by writeback error
Message-ID: <20180925223054.GH2933@thunk.org>
References: <486f6105fd4076c1af67dae7fdfe6826019f7ff4.camel@redhat.com>
    <20180925003044.239531c7@alans-desktop>
    <0662a4c5d2e164d651a6a116d06da380f317100f.camel@redhat.com>
    <20180925154627.GC2933@thunk.org>
    <23cd68a665d27216415dc79367ffc3bee1b60b86.camel@redhat.com>
In-Reply-To: <23cd68a665d27216415dc79367ffc3bee1b60b86.camel@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 25, 2018 at 12:41:18PM -0400, Jeff Layton wrote:
> That's all well and good, but still doesn't quite solve the main concern
> with all of this. It's suppose we have this series of events:
>
> open file r/w
> write 1024 bytes to offset 0
>
> read 1024 bytes from offset 0
>
> Open, write and read are successful, and there was no fsync or close in
> between them. Will that read reflect the result of the previous write or
> no?

If the background writeback hasn't happened, Posix requires that the
read returns the result of the write. And the user doesn't know when
or if the background writeback has happened unless the user calls
fsync(2).

Posix in general basically says anything is possible if the system
fails or crashes, or is dropped into molten lava, etc. Do we say that
Linux is not Posix compliant if a cosmic ray flips a few bits in the
page cache? Hardly!
The *only* time Posix makes any guarantees is if fsync(2) returns
success. So the subject line is, in my opinion, incorrect. The moment
we are worrying about storage errors, and the user hasn't used
fsync(2), Posix is no longer relevant for the purposes of the
discussion.

> The answer today is "it depends".

And I think that's fine. The only way we can make any guarantees is
if we do what Alan suggested, which is to require that a read on a
dirty page *block* until the page is successfully written back. This
would destroy performance. I know I wouldn't want to use such a
system, and if someone were to propose it, I'd strongly argue for a
switch to turn it *off*, and I suspect most system administrators
would turn it off once they saw what it did to system performance.

(As a thought experiment, think about what it would do to kernel
compiles. It means that before you link the .o files, you would have
to block and wait for them to be written to disk so you could be sure
the writeback would be successful. **Ugh**.)

Given that many people would turn such a feature off once they saw
what it does to their system performance, applications in general
couldn't rely on it, which means applications that cared would have
to do what they should have done all along. If it's precious data,
use fsync(2). If not, most of the time things are *fine*, and it's not
worth sacrificing performance for the corner cases unless it really is
ultra-precious data and you are willing to pay the overhead.

                                        - Ted
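
A minimal sketch of the "if it's precious data, use fsync(2)" pattern
from the message above, under some loudly stated assumptions: it uses
Python's os module, whose open, pwrite, fsync and pread functions are
thin wrappers around the system calls of the same names, and the helper
name write_precious and the file name precious.dat are hypothetical,
chosen only for illustration. It is not code from the thread.

```python
import os
import tempfile


def write_precious(path: str, data: bytes) -> bytes:
    """Write data at offset 0, fsync it, then read it back.

    Hypothetical helper for illustration. Raises OSError if the write
    or the fsync reports a failure.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # pwrite(2) may write fewer bytes than requested, so loop
        # until the whole buffer has been handed to the kernel.
        written = 0
        while written < len(data):
            written += os.pwrite(fd, data[written:], written)

        # Until fsync(2) returns success, the data may exist only in
        # the page cache. A writeback error surfaces here as OSError.
        os.fsync(fd)

        # Read the same range back from offset 0.
        return os.pread(fd, len(data), 0)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        payload = b"x" * 1024
        result = write_precious(os.path.join(tmp, "precious.dat"), payload)
        assert result == payload
```

The point of the sketch is the ordering: the caller treats the data as
safe only after os.fsync() has returned without raising. Without that
call, a successful read says nothing about whether writeback to storage
succeeded.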