Date: Tue, 25 Sep 2018 00:09:30 +0100
From: Alan Cox
To: Rogier Wolff
Cc: Dave Chinner, Jeff Layton, 焦晓冬, bfields@fieldses.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: POSIX violation by writeback error
Message-ID: <20180925000930.3d4a93fd@alans-desktop>
In-Reply-To: <20180906091718.GL24519@BitWizard.nl>
Organization: Intel Corporation

On Thu, 6 Sep 2018 11:17:18 +0200
Rogier Wolff wrote:

> On Thu, Sep 06, 2018 at 12:57:09PM +1000, Dave Chinner wrote:
> > On Wed, Sep 05, 2018 at 02:07:46PM +0200, Rogier Wolff wrote:
>
> > > And this has worked for years because
> > > the kernel caches stuff from inodes and data-blocks. If you suddenly
> > > write stuff to harddisk at 10ms for each seek between inode area and
> > > data-area..
> >
> > You're assuming an awful lot about filesystem implementation here.
> > Neither ext4, btrfs or XFS issue physical IO like this when flushing
> > data.
>
> My thinking is: When fsync (implicit or explicit) needs to know
> the result of the underlying IO, it needs to wait for it to have
> happened.

Worse than that. In many cases it needs to wait for the I/O command to
have been accepted and confirmed by the drive, then tell the disk to do
a commit to physical media, then see if that blows up. A confirmation
the disk got the data is not a confirmation that it's stable.

Your disk can also reply from its internal cache with data that will
fail to hit the media a few seconds later.

Given a cache flush on an ATA disk can take 7 seconds I'm not fond of it
8) Fortunately spinning rust is on the way out.

It's even uglier in truth. Spinning rust rewrites sectors under you by
magic without your knowledge and in freaky cases you can have data turn
error that you've not even touched this month. Flash has some similar
behaviour although it can at least use a supercap to do real work.

You can also issue things like a single 16K write and have only the last
8K succeed and the drive report an error, which freaks out some
supposedly robust techniques.

Alan
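
[Editor's illustration, not part of the original message.] From userspace, the distinction drawn above between "the write was accepted" and "the data is stable" comes down to checking the result of fsync() as well as write(). The following is a minimal sketch in Python, assuming a POSIX system; the helper name write_durably and the 16K payload in the usage note are hypothetical, chosen only to echo the example in the message. It cannot detect the cases described above where the drive itself answers from its cache or silently rewrites sectors; it only surfaces errors the kernel reports.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration. Raises OSError if either the
    write or the flush to the device reports a failure.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may accept fewer bytes than requested, so loop
            # until the whole buffer has been handed to the kernel.
            written = os.write(fd, view)
            view = view[written:]
        # A successful write() only means the kernel took the data.
        # fsync() asks for it to be pushed to the device and waits;
        # a writeback error, if reported at all, shows up here.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A call such as write_durably("/tmp/example.dat", b"x" * 16384) returns normally only if every step reported success. If os.fsync() raises, the safer assumption is that some or all of the data may not be on stable storage; calling fsync() again and getting success does not necessarily mean the earlier data made it.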