Date: Mon, 21 Sep 2020 09:41:03 +1000
From: Dave Chinner
To: Jan Kara
Cc: Mikulas Patocka, Dan Williams, Linus Torvalds, Alexander Viro, Andrew Morton, Matthew Wilcox, Eric Sandeen, Dave Chinner, Linux Kernel Mailing List, linux-fsdevel
Subject: Re: the "read" syscall sees partial effects of the "write" syscall
Message-ID: <20200920234103.GX12096@dread.disaster.area>
References: <20200918131317.GH18920@quack2.suse.cz>
In-Reply-To: <20200918131317.GH18920@quack2.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 18, 2020 at 03:13:17PM +0200, Jan Kara wrote:
> On Fri 18-09-20 08:25:28, Mikulas Patocka wrote:
> > I'd like to ask about this problem: when we write to a file, the kernel
> > takes the write inode lock. When we read from a file, no lock is taken -
> > thus the read syscall can read data that are halfway modified by the write
> > syscall.
> >
> > The standard specifies the effects of the write syscall are atomic - see
> > this:
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_07
>
> Yes, but no Linux filesystem (except for XFS AFAIK) follows the POSIX spec
> in this regard. Mostly because the mixed read-write performance sucks when
> you follow it (not that it would absolutely have to suck - you can use
> clever locking with range locks but nobody does it currently). In practice,
> the read-write atomicity works on Linux only on per-page basis for buffered
> IO.

We come across this from time to time with POSIX compliant applications
being ported from other Unixes that rely on a write from one thread
being seen atomically from a read from another thread.

There are quite a few custom enterprise apps around that rely on
this POSIX behaviour, especially stuff that has come from different
Unixes that actually provided Posix compliant behaviour.

IOWs, from an upstream POV, POSIX atomic write behaviour doesn't
matter very much. From an enterprise distro POV it's often a
different story....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
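
For anyone who wants to see the behaviour discussed above on their own filesystem, here is a minimal sketch of a test, not taken from this thread. One thread keeps overwriting a file range with a single repeated byte value while the main thread reads the same range and counts reads that contain a mix of values. The names is_torn and count_torn_reads, the 1 MiB default size and the round count are all illustrative assumptions. Going by the description above, a non-zero count would be expected with buffered IO on most Linux filesystems and zero on XFS, but the result depends on timing and the sketch proves nothing either way.

```python
import os
import tempfile
import threading


def is_torn(buf: bytes) -> bool:
    """Return True if buf contains more than one distinct byte value."""
    return len(buf) > 0 and buf.count(buf[:1]) != len(buf)


def count_torn_reads(path: str, size: int = 1 << 20, rounds: int = 200) -> int:
    """Race one writer against one reader over the range [0, size).

    The writer overwrites the whole range with one repeated byte per
    write() call. If write() were atomic with respect to read(), every
    full-length read would see a single byte value.
    Assumption: each pwrite() on a regular file writes all `size` bytes;
    a short write would show up here as a false positive.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.pwrite(fd, bytes(size), 0)  # start from an all-zero file
        stop = threading.Event()

        def writer() -> None:
            value = 0
            while not stop.is_set():
                value = (value + 1) % 256
                os.pwrite(fd, bytes([value]) * size, 0)

        thread = threading.Thread(target=writer)
        thread.start()
        torn = 0
        try:
            for _ in range(rounds):
                data = os.pread(fd, size, 0)
                # Only judge full-length reads; skip short ones.
                if len(data) == size and is_torn(data):
                    torn += 1
        finally:
            stop.set()
            thread.join()
        return torn
    finally:
        os.close(fd)
```

To try it, call count_torn_reads() with a path on the filesystem under test and print the result. A larger size spans more pages and makes a torn read more likely to show up; it needs a Unix system, since os.pread and os.pwrite are not available elsewhere.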