From: Jiri Slaby <jslaby@suse.cz>
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
        "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        Peter Zijlstra <peterz@infradead.org>, Jiri Slaby <jslaby@suse.cz>
Subject: [PATCH 3.12 42/72] compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()
Date: Mon,  7 Nov 2016 14:04:49 +0100
Message-Id: <afe98b3b0e72a1238d8ef394195f3ce5aff27e05.1478523828.git.jslaby@suse.cz>
In-Reply-To: <0f3caac741164dcff670ae0f4d1cfcb0a7026a1c.1478523828.git.jslaby@suse.cz>
References: <0f3caac741164dcff670ae0f4d1cfcb0a7026a1c.1478523828.git.jslaby@suse.cz>
In-Reply-To: <cover.1478523828.git.jslaby@suse.cz>
References: <cover.1478523828.git.jslaby@suse.cz>
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2287
Lines: 56

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

commit 536fa402221f09633e7c5801b327055ab716a363 upstream.

CPUs without single-byte and double-byte loads and stores place some
"interesting" requirements on concurrent code.  For example (adapted
from Peter Hurley's test code), suppose we have the following structure:

	struct foo {
		spinlock_t lock1;
		spinlock_t lock2;
		char a; /* Protected by lock1. */
		char b; /* Protected by lock2. */
	};
	struct foo *foop;

Of course, it is common (and good) practice to place data protected
by different locks in separate cache lines.  However, if the locks are
rarely acquired (for example, only in rare error cases), and there are
a great many instances of the data structure, then memory footprint can
trump false-sharing concerns, so that it can be better to place them in
the same cache cache line as above.

But if the CPU does not support single-byte loads and stores, a store
to foop->a will do a non-atomic read-modify-write operation on foop->b,
which will come as a nasty surprise to someone holding foop->lock2.  So we
now require CPUs to support single-byte and double-byte loads and stores.
Therefore, this commit adjusts the definition of __native_word() to allow
these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 include/linux/compiler.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 913532c0c140..f968eefaf1e8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -362,7 +362,7 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 
 /* Is this type a native word size -- useful for atomic operations */
 #ifndef __native_word
-# define __native_word(t) (sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
+# define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
 #endif
 
 /* Compile time object size, -1 for unknown */
-- 
2.10.2