From: Linus Torvalds
Date: Wed, 27 Apr 2022 17:00:16 -0700
Subject: Re: [GIT PULL] gfs2 fix
To: Andreas Gruenbacher
Cc: cluster-devel, Linux Kernel Mailing List
References: <20220426145445.2282274-1-agruenba@redhat.com>

On Wed, Apr 27, 2022 at 3:20 PM Linus Torvalds wrote:
>
> So I really think
>
>  (a) you are mis-reading the standard by attributing too strong logic
> to paperwork that is English prose and not so exact
>
>  (b) documenting Linux as not doing what you are mis-reading it for is
> only encouraging others to mis-read it too
>
> The whole "arbitrary writes have to be all-or-nothing wrt all other
> system calls" is simply not realistic, and has never been. Not just
> not in Linux, but in *ANY* operating system that POSIX was meant to
> describe.

Side note: a lot of those "atomic" things in that documentation have
come from a history of signal handling atomicity issues, and from all
the issues people had with (a) user-space threading implementations
and (b) emulation layers from non-Unixy environments.

So when they say that things like "rename()" have to be
all-or-nothing, it's to clarify that you can't emulate it as a "link
and delete original" kind of operation (which old UNIX *did* do) and
claim to be POSIX. Because while the end result of rename() and
link()+unlink() might be similar, people did rely on that whole "use
rename as a way to create an atomic marker in the filesystem" (which
is a very traditional UNIX pattern).

So "rename()" has to be atomic, and the legacy behavior of link+unlink
is not valid in POSIX.

Similarly, you can't implement "pread()" as a "lseek+read+lseek back",
because that doesn't work if somebody else is doing another "pread()"
on the same file descriptor concurrently.

Again, people *did* write exactly those kinds of implementations of
"pread()", and yes, they were broken for both signals and for
threading.

So there's "atomicity" and then there is "atomicity".

That "all or nothing" can be a very practical thing to describe
*roughly* how it must work on a higher level, or it can be a
theoretical "transactional" thing that works literally like a database
where the operation happens in full and you must not see any
intermediate state.

And no, "write()" and friends have never ever been about some
transactional operation where you can't see how the file grows as it
is being written to.
That kind of atomicity has simply never existed, not even in theory.

So when you see POSIX saying that a "read()" system call is "atomic",
you should *not* see it as a transaction thing, but see it in the
historical context of "people used to do threading libraries in user
space, and since they didn't want a big read() to block all other
threads, they'd split it up into many smaller reads and now another
thread *also* doing 'read()' system calls would see the data it read
being not one contiguous region, but multiple regions where the file
position changed in the middle".

Similarly, a "read()" system call will not be interrupted by a signal
in the middle, where the signal handler would do a "lseek()" or
another "read()", and now the original "read()" data suddenly is
affected.

That's why that whole "f_pos is atomic" thing is a big deal. Because
there literally were threading libraries (and badly emulated
environments) where that *WASN'T* the case, and _that_ is why POSIX
then talks about it.

So think of POSIX not as some hard set of "this is exactly how things
work and we describe every detail". Instead, treat it a bit like
historians treat Herodotus - interpreting his histories by taking the
issues of the time into account. POSIX is trying to clarify and
document the problems of the time it was written, and taking other
things for granted.

              Linus
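
To make the "lseek+read+lseek back" point concrete, here is a rough
sketch in Python (chosen only for brevity; it is not from any real
libc or threading library). The names emulated_pread, between and demo
are hypothetical. The between hook is an assumption standing in for a
signal handler, or another thread, that runs in the middle of the
emulation and does its own plain read() on the same file descriptor.

```python
import os
import tempfile


def emulated_pread(fd, length, offset, between=None):
    """Hypothetical pread() emulation: lseek + read + lseek back.

    `between` is an illustrative hook that runs after the seek and
    before the read, standing in for a signal handler or another
    thread touching the same file descriptor.
    """
    saved = os.lseek(fd, 0, os.SEEK_CUR)   # remember the shared file position
    os.lseek(fd, offset, os.SEEK_SET)      # move it; everyone using fd sees this
    if between is not None:
        between()                          # the "somebody else" gets to run here
    data = os.read(fd, length)             # reads from wherever the position is now
    os.lseek(fd, saved, os.SEEK_SET)       # put the position back
    return data


def demo():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)

        # The real pread() takes the offset as an argument and never
        # touches the shared file position.
        real = os.pread(fd, 3, 5)

        # The emulation gives the same answer as long as nobody interferes.
        alone = emulated_pread(fd, 3, 5)

        # With a plain read() sneaking in between the seek and the read,
        # the emulation returns data from the wrong offset.
        interrupted = emulated_pread(
            fd, 3, 5, between=lambda: os.read(fd, 2)
        )
        return {"real": real, "alone": alone, "interrupted": interrupted}
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Run undisturbed, the emulation agrees with os.pread(). Once the
intruding read() moves the shared position, the emulated call returns
bytes from offset 7 instead of offset 5, and the intruder itself read
from offset 5 rather than from where it thought the file position was.
That is the kind of breakage the "atomic" wording is there to rule out.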