Date: Fri, 18 Sep 2020 08:25:28 -0400 (EDT)
From: Mikulas Patocka
To: Dan Williams
cc: Linus Torvalds, Alexander Viro, Andrew Morton, Matthew Wilcox, Jan Kara, Eric Sandeen, Dave Chinner, Linux Kernel Mailing List, linux-fsdevel
Subject: the "read" syscall sees partial effects of the "write" syscall

Hi

I'd like to ask about this problem: when we write to a file, the kernel
takes the write inode lock. When we read from a file, no lock is taken -
thus the read syscall can read data that are halfway modified by the write
syscall.

The standard specifies that the effects of the write syscall are atomic -
see this:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_07

> 2.9.7 Thread Interactions with Regular File Operations
>
> All of the following functions shall be atomic with respect to each
> other in the effects specified in POSIX.1-2017 when they operate on
> regular files or symbolic links:
>
> chmod()      fchownat()   lseek()       readv()      unlink()
> chown()      fcntl()      lstat()       pwrite()     unlinkat()
> close()      fstat()      open()        rename()     utime()
> creat()      fstatat()    openat()      renameat()   utimensat()
> dup2()       ftruncate()  pread()       stat()       utimes()
> fchmod()     lchown()     read()        symlink()    write()
> fchmodat()   link()       readlink()    symlinkat()  writev()
> fchown()     linkat()     readlinkat()  truncate()
>
> If two threads each call one of these functions, each call shall either
> see all of the specified effects of the other call, or none of them. The
> requirement on the close() function shall also apply whenever a file
> descriptor is successfully closed, however caused (for example, as a
> consequence of calling close(), calling dup2(), or of process
> termination).

Should the read call take the read inode lock to make it atomic w.r.t. the
write syscall? (I know - taking the read lock causes a big performance hit
due to cache line bouncing)

I've created this program to test it - it has two threads, one writing and
the other reading and verifying. When I run it on OpenBSD or FreeBSD, it
passes; on Linux it fails with "we read modified bytes".

Mikulas

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

#define L 65536

static int h;
static pthread_barrier_t barrier;
static pthread_t thr;
static char rpattern[L];
static char wpattern[L];

/* Reader thread: read the whole block and check that every byte is equal. */
static void *reader(__attribute__((unused)) void *ptr)
{
	while (1) {
		int r;
		size_t i;
		/* Start the pread at the same moment as the writer's pwrite. */
		r = pthread_barrier_wait(&barrier);
		if (r > 0) fprintf(stderr, "pthread_barrier_wait: %s\n", strerror(r)), exit(1);
		r = pread(h, rpattern, L, 0);
		if (r != L) perror("pread"), exit(1);
		/* A mix of old and new bytes means we saw a partial write. */
		for (i = 0; i < L; i++) {
			if (rpattern[i] != rpattern[0])
				fprintf(stderr, "we read modified bytes\n"), exit(1);
		}
	}
	return NULL;
}

int main(__attribute__((unused)) int argc, char *argv[])
{
	int r;
	h = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (h < 0) perror("open"), exit(1);
	/* Fill the file with the initial (all-zero) pattern. */
	r = pwrite(h, wpattern, L, 0);
	if (r != L) perror("pwrite"), exit(1);
	r = pthread_barrier_init(&barrier, NULL, 2);
	if (r) fprintf(stderr, "pthread_barrier_init: %s\n", strerror(r)), exit(1);
	r = pthread_create(&thr, NULL, reader, NULL);
	if (r) fprintf(stderr, "pthread_create: %s\n", strerror(r)), exit(1);
	/* Writer loop: bump every byte, then overwrite the whole block. */
	while (1) {
		size_t i;
		for (i = 0; i < L; i++)
			wpattern[i]++;
		r = pthread_barrier_wait(&barrier);
		if (r > 0) fprintf(stderr, "pthread_barrier_wait: %s\n", strerror(r)), exit(1);
		r = pwrite(h, wpattern, L, 0);
		if (r != L) perror("pwrite"), exit(1);
	}
}