Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751648AbdIUKFe (ORCPT ); Thu, 21 Sep 2017 06:05:34 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40026 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750912AbdIUKFc (ORCPT ); Thu, 21 Sep 2017 06:05:32 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 301A4A1F46 Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=mpatocka@redhat.com Date: Thu, 21 Sep 2017 06:05:25 -0400 (EDT) From: Mikulas Patocka X-X-Sender: mpatocka@file01.intranet.prod.int.rdu2.redhat.com To: Joe Lawrence cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Alexander Viro , "Luis R. Rodriguez" , Kees Cook , Michael Kerrisk , Randy Dunlap Subject: Re: [PATCH RFC 2/3] pipe: protect pipe_max_size access with a mutex In-Reply-To: <4121a697-d2fb-ec65-eea3-3c01592c8e08@redhat.com> Message-ID: References: <1504622676-2992-1-git-send-email-joe.lawrence@redhat.com> <1504622676-2992-3-git-send-email-joe.lawrence@redhat.com> <29f65845-40e5-839d-0bc7-79a5c1fa6f4b@redhat.com> <4121a697-d2fb-ec65-eea3-3c01592c8e08@redhat.com> User-Agent: Alpine 2.02 (LRH 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Thu, 21 Sep 2017 10:05:32 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3701 Lines: 96 On Tue, 19 Sep 2017, Joe Lawrence wrote: > On 09/19/2017 03:53 AM, Mikulas Patocka wrote: > > On Fri, 15 Sep 2017, Joe Lawrence wrote: > > [ ... snip ... ] > >> Hi Mikulas, > >> > >> I'm not strong when it comes to memory barriers, but one of the > >> side-effects of using the mutex is that pipe_set_size() and > >> alloc_pipe_info() should have a consistent view of pipe_max_size. > >> > >> If I remove the mutex (and assume that I implement a custom > >> do_proc_dointvec "conv" callback), is it safe for these routines to > >> directly use pipe_max_size as they had done before? > >> > >> If not, is it safe to alias through a temporary stack variable (ie, > >> could the compiler re-read pipe_max_size multiple times in the function)? > >> > >> Would READ_ONCE() help in any way? > > > > Theoretically re-reading the variable is possible and you should use > > ACCESS_ONCE or READ_ONCE+WRITE_ONCE on that variable. > > > > In practice, ACCESS_ONCE/READ_ONCE/WRITE_ONCE is missing at a lot of > > kernel variables that could be modified asynchronously and no one is > > complaining about it and no one is making any systematic effort to fix it. > > > > That re-reading happens (I have some test code that makes the gcc > > optimizer re-read a variable), but it happens very rarely. > > This would be interesting to look at if you are willing to share (can > send offlist). struct s { unsigned a, b, c, d; }; unsigned fn(struct s *s) { unsigned a = s->a; s->b = a; asm("nop":::"ebx","ecx","edx","esi","edi","ebp"); s->c = a; return s->d; } This piece of code makes gcc read the variable s->a twice (although it is read only once in the source code). Compile it with -m32 -O2. The result is this: 00000000 : 0: 55 push %ebp 1: 57 push %edi 2: 56 push %esi 3: 53 push %ebx 4: 8b 44 24 14 mov 0x14(%esp),%eax 8: 8b 10 mov (%eax),%edx <--- 1st load of s->a a: 89 50 04 mov %edx,0x4(%eax) d: 90 nop e: 8b 08 mov (%eax),%ecx <--- 2nd load of s->a 10: 89 48 08 mov %ecx,0x8(%eax) 13: 8b 40 0c mov 0xc(%eax),%eax 16: 5b pop %ebx 17: 5e pop %esi 18: 5f pop %edi 19: 5d pop %ebp 1a: c3 ret > > Another theoretical problem is that when reading or writing a variable > > without ACCESS_ONCE, the compiler could read and write the variable using > > multiple smaller memory accesses. But in practice, it happens only on some > > non-common architectures. > > Smaller access than word size? Yes. The file Documentation/memory-barriers.txt says: (*) For aligned memory locations whose size allows them to be accessed with a single memory-reference instruction, prevents "load tearing" and "store tearing," in which a single large access is replaced by multiple smaller accesses. For example, given an architecture having 16-bit store instructions with 7-bit immediate fields, the compiler might be tempted to use two 16-bit store-immediate instructions to implement the following 32-bit store: p = 0x00010002; Please note that GCC really does use this sort of optimization, which is not surprising given that it would likely take more than two instructions to build the constant and then store it. But it doesn't say on which architecture gcc does it. Mikulas