2014-12-10 14:23:47

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

v1 -> v2:
- moved pagefault_count to the end of thread_info for all archs that would have
required manually calculating asm-offsets - to keep changes minimal.
- remove unlikely() from "mm, uaccess: trigger might_sleep() in" and keep
changes minimal (in_atomic() -> pagefault_disabled())

----

I recently discovered that might_fault() doesn't call might_sleep() anymore.
Therefore bugs like:
spin_lock(&lock);
rc = copy_to_user(...);
spin_unlock(&lock);
would not be detected with CONFIG_DEBUG_ATOMIC_SLEEP. The code was changed to
disable false positives for code like:
pagefault_disable();
rc = copy_to_user(...);
pagefault_enable();

Until now, pagefault_disable() and pagefault_enable() simply modified the
preempt count, therefore telling the pagefault handler that the context is
atomic and sleeping is disallowed.

In order to reenable might_sleep() checks for the correct path, we need a way to
detect whether we run in a pagefault_disable() context.

This series therefore introduces a separate pagefault_count and uses it to count
the levels of pagefault_disable() per thread. might_sleep() checks are
reactivated for the !pagefault_disable() path.

So this should now work:
spin_lock(&lock); /* also if left away */
pagefault_disable()
rc = copy_to_user(...);
pagefault_enable();
spin_unlock(&lock);
And this should report a warning again:
spin_lock(&lock);
rc = copy_to_user(...);
spin_unlock(&lock);

Please note that this series will not completely split the handling of
pagefault_disable() and the preempt count. This will be done in another series.
Purpose of this series is to reenable might_sleep() checks for might_fault(),
avoiding to produce false positives.

Cross compiled on powerpc, arm, sparc, sparc64, arm64, x86_64, i386, mips,
alpha, ia64, xtensa, m68k, microblaze.

Tested on s390.


David Hildenbrand (5):
uaccess: add pagefault_count to thread_info
uaccess: count pagefault_disable() levels in pagefault_count
mm, uaccess: trigger might_sleep() in might_fault() when pagefaults
are disabled
uaccess: clarify that uaccess may only sleep if pagefaults are not
disabled
uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count

arch/alpha/include/asm/thread_info.h | 1 +
arch/arc/include/asm/thread_info.h | 1 +
arch/arm/include/asm/thread_info.h | 1 +
arch/arm64/include/asm/thread_info.h | 1 +
arch/avr32/include/asm/thread_info.h | 1 +
arch/avr32/include/asm/uaccess.h | 12 +++++---
arch/blackfin/include/asm/thread_info.h | 1 +
arch/c6x/include/asm/thread_info.h | 1 +
arch/cris/include/asm/thread_info.h | 1 +
arch/frv/include/asm/thread_info.h | 1 +
arch/hexagon/include/asm/thread_info.h | 1 +
arch/hexagon/include/asm/uaccess.h | 3 +-
arch/ia64/include/asm/thread_info.h | 1 +
arch/m32r/include/asm/thread_info.h | 1 +
arch/m32r/include/asm/uaccess.h | 30 ++++++++++++------
arch/m68k/include/asm/thread_info.h | 1 +
arch/metag/include/asm/thread_info.h | 1 +
arch/microblaze/include/asm/thread_info.h | 1 +
arch/microblaze/include/asm/uaccess.h | 6 ++--
arch/mips/include/asm/thread_info.h | 1 +
arch/mips/include/asm/uaccess.h | 45 ++++++++++++++++++---------
arch/mn10300/include/asm/thread_info.h | 1 +
arch/openrisc/include/asm/thread_info.h | 1 +
arch/parisc/include/asm/thread_info.h | 1 +
arch/powerpc/include/asm/thread_info.h | 1 +
arch/s390/include/asm/thread_info.h | 1 +
arch/s390/include/asm/uaccess.h | 15 ++++++---
arch/score/include/asm/thread_info.h | 1 +
arch/score/include/asm/uaccess.h | 15 ++++++---
arch/sh/include/asm/thread_info.h | 1 +
arch/sparc/include/asm/thread_info_32.h | 1 +
arch/sparc/include/asm/thread_info_64.h | 1 +
arch/tile/include/asm/thread_info.h | 1 +
arch/tile/include/asm/uaccess.h | 21 ++++++++-----
arch/um/include/asm/thread_info.h | 1 +
arch/unicore32/include/asm/thread_info.h | 1 +
arch/x86/include/asm/thread_info.h | 1 +
arch/x86/include/asm/uaccess.h | 15 ++++++---
arch/x86/include/asm/uaccess_32.h | 6 ++--
arch/x86/lib/usercopy_32.c | 6 ++--
arch/xtensa/include/asm/thread_info.h | 1 +
include/linux/kernel.h | 3 +-
include/linux/uaccess.h | 51 ++++++++++++++++++++++++++-----
lib/Kconfig.debug | 9 ++++++
lib/strnlen_user.c | 6 ++--
mm/maccess.c | 11 +++++++
mm/memory.c | 18 ++++-------
47 files changed, 222 insertions(+), 80 deletions(-)

--
1.8.5.5


2014-12-10 14:23:52

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 5/5] uaccess: CONFIG_DEBUG_PAGEFAULT_COUNT to debug pagefault_count

This patch introduces CONFIG_DEBUG_PAGEFAULT_COUNT, to detect over-/underflows
in the pagefault_count resulting from a wrong usage of pagefault_enable() and
pagefault_disable().

Signed-off-by: David Hildenbrand <[email protected]>
---
include/linux/uaccess.h | 8 +++++++-
lib/Kconfig.debug | 9 +++++++++
mm/maccess.c | 11 +++++++++++
3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 1dfc678..6ffb90b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -20,11 +20,17 @@ static __always_inline void pagefault_count_inc(void)
(*pagefault_count_ptr())++;
}

-static __always_inline void pagefault_count_dec(void)
+static __always_inline void __pagefault_count_dec(void)
{
(*pagefault_count_ptr())--;
}

+#ifdef CONFIG_DEBUG_PAGEFAULT_COUNT
+extern void pagefault_count_dec(void);
+#else
+#define pagefault_count_dec() __pagefault_count_dec()
+#endif
+
/*
* These routines enable/disable the pagefault handler. If disabled, it will
* not take any locks and go straight to the fixup table.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d780351..9dea6e0 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -860,6 +860,15 @@ config DEBUG_PREEMPT
if kernel code uses it in a preemption-unsafe way. Also, the kernel
will detect preemption count underflows.

+config DEBUG_PAGEFAULT_COUNT
+ bool "Debug pagefault_disable / pagefault_enable"
+ depends on DEBUG_KERNEL
+ default n
+ help
+ If you say Y here then the kernel will detect pagefault count
+ over-/underflows and therefore non-matching pagefault_enable() and
+ pagefault_disable() calls.
+
menu "Lock Debugging (spinlocks, mutexes, etc...)"

config DEBUG_RT_MUTEXES
diff --git a/mm/maccess.c b/mm/maccess.c
index d53adf9..4b72aa1 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -5,6 +5,17 @@
#include <linux/mm.h>
#include <linux/uaccess.h>

+#ifdef CONFIG_DEBUG_PAGEFAULT_COUNT
+void pagefault_count_dec(void)
+{
+ __pagefault_count_dec();
+
+ /* underflow / previous overflow ? */
+ WARN_ON(pagefault_count() < 0);
+}
+EXPORT_SYMBOL(pagefault_count_dec);
+#endif
+
/**
* probe_kernel_read(): safely attempt to read from a location
* @dst: pointer to the buffer that shall take the data
--
1.8.5.5

2014-12-10 14:24:20

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 4/5] uaccess: clarify that uaccess may only sleep if pagefaults are not disabled

In general, non-atomic variants of user access functions may not sleep if
pagefaults are disabled.

Let's update all relevant comments in uaccess code. This also refelects the
might_sleep() checks in might_fault().

Signed-off-by: David Hildenbrand <[email protected]>
---
arch/avr32/include/asm/uaccess.h | 12 ++++++----
arch/hexagon/include/asm/uaccess.h | 3 ++-
arch/m32r/include/asm/uaccess.h | 30 +++++++++++++++--------
arch/microblaze/include/asm/uaccess.h | 6 +++--
arch/mips/include/asm/uaccess.h | 45 +++++++++++++++++++++++------------
arch/s390/include/asm/uaccess.h | 15 ++++++++----
arch/score/include/asm/uaccess.h | 15 ++++++++----
arch/tile/include/asm/uaccess.h | 21 ++++++++++------
arch/x86/include/asm/uaccess.h | 15 ++++++++----
arch/x86/include/asm/uaccess_32.h | 6 +++--
arch/x86/lib/usercopy_32.c | 6 +++--
lib/strnlen_user.c | 6 +++--
12 files changed, 120 insertions(+), 60 deletions(-)

diff --git a/arch/avr32/include/asm/uaccess.h b/arch/avr32/include/asm/uaccess.h
index 245b2ee..6b6dd58 100644
--- a/arch/avr32/include/asm/uaccess.h
+++ b/arch/avr32/include/asm/uaccess.h
@@ -97,7 +97,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -116,7 +117,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -136,7 +138,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -158,7 +161,8 @@ static inline __kernel_size_t __copy_from_user(void *to,
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index e4127e4..af98618 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -36,7 +36,8 @@
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
diff --git a/arch/m32r/include/asm/uaccess.h b/arch/m32r/include/asm/uaccess.h
index 84fe7ba..83bfa33 100644
--- a/arch/m32r/include/asm/uaccess.h
+++ b/arch/m32r/include/asm/uaccess.h
@@ -91,7 +91,8 @@ static inline void set_fs(mm_segment_t s)
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
@@ -155,7 +156,8 @@ extern int fixup_exception(struct pt_regs *regs);
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -175,7 +177,8 @@ extern int fixup_exception(struct pt_regs *regs);
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -194,7 +197,8 @@ extern int fixup_exception(struct pt_regs *regs);
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -274,7 +278,8 @@ do { \
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -568,7 +573,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -588,7 +594,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space.
*
@@ -606,7 +613,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -626,7 +634,8 @@ unsigned long __generic_copy_from_user(void *, const void __user *, unsigned lon
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space.
*
@@ -677,7 +686,8 @@ unsigned long clear_user(void __user *mem, unsigned long len);
* strlen_user: - Get the size of a string in user space.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 59a89a6..53bfbb8 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -178,7 +178,8 @@ extern long __user_bad(void);
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -290,7 +291,8 @@ extern long __user_bad(void);
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 22a5624..eded117 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -103,7 +103,8 @@ extern u64 __ua_limit;
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
@@ -138,7 +139,8 @@ extern u64 __ua_limit;
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -157,7 +159,8 @@ extern u64 __ua_limit;
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -177,7 +180,8 @@ extern u64 __ua_limit;
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -199,7 +203,8 @@ extern u64 __ua_limit;
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -498,7 +503,8 @@ extern void __put_user_unknown(void);
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -517,7 +523,8 @@ extern void __put_user_unknown(void);
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -537,7 +544,8 @@ extern void __put_user_unknown(void);
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -559,7 +567,8 @@ extern void __put_user_unknown(void);
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -815,7 +824,8 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -888,7 +898,8 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space.
*
@@ -1075,7 +1086,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -1107,7 +1119,8 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space.
*
@@ -1356,7 +1369,8 @@ static inline long __strlen_user(const char __user *s)
* strlen_user: - Get the size of a string in user space.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
@@ -1425,7 +1439,8 @@ static inline long __strnlen_user(const char __user *s, long n)
* strnlen_user: - Get the size of a string in user space.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index cd4c68e..1291da5 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -98,7 +98,8 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x)
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -118,7 +119,8 @@ unsigned long __must_check __copy_from_user(void *to, const void __user *from,
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -264,7 +266,8 @@ int __get_user_bad(void) __attribute__((noreturn));
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space.
*
@@ -290,7 +293,8 @@ __compiletime_warning("copy_from_user() buffer size is not provably correct")
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space.
*
@@ -348,7 +352,8 @@ static inline unsigned long strnlen_user(const char __user *src, unsigned long n
* strlen_user: - Get the size of a string in user space.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
diff --git a/arch/score/include/asm/uaccess.h b/arch/score/include/asm/uaccess.h
index ab66ddd..010a800 100644
--- a/arch/score/include/asm/uaccess.h
+++ b/arch/score/include/asm/uaccess.h
@@ -36,7 +36,8 @@
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
@@ -61,7 +62,8 @@
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -79,7 +81,8 @@
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -98,7 +101,8 @@
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -119,7 +123,8 @@
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
diff --git a/arch/tile/include/asm/uaccess.h b/arch/tile/include/asm/uaccess.h
index b6cde32..9efe668 100644
--- a/arch/tile/include/asm/uaccess.h
+++ b/arch/tile/include/asm/uaccess.h
@@ -78,7 +78,8 @@ int __range_ok(unsigned long addr, unsigned long size);
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
@@ -192,7 +193,8 @@ extern int __get_user_bad(void)
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -272,7 +274,8 @@ extern int __put_user_bad(void)
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -327,7 +330,8 @@ extern int __put_user_bad(void)
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -363,7 +367,8 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -434,7 +439,8 @@ static inline unsigned long __must_check copy_from_user(void *to,
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to user space. Caller must check
* the specified blocks with access_ok() before calling this function.
@@ -466,7 +472,8 @@ copy_in_user(void __user *to, const void __user *from, unsigned long n)
* strlen_user: - Get the size of a string in user space.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 0d592e0..014d8cb 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -74,7 +74,8 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
* @addr: User space pointer to start of block to check
* @size: Size of block to check
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Checks if a pointer to a block of memory in user space is valid.
*
@@ -145,7 +146,8 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -240,7 +242,8 @@ extern void __put_user_8(void);
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
@@ -455,7 +458,8 @@ struct __large_struct { unsigned long buf[100]; };
* @x: Variable to store result.
* @ptr: Source address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple variable from user space to kernel
* space. It supports simple types like char and int, but not larger
@@ -479,7 +483,8 @@ struct __large_struct { unsigned long buf[100]; };
* @x: Value to copy to user space.
* @ptr: Destination address, in user space.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* This macro copies a single simple value from kernel space to user
* space. It supports simple types like char and int, but not larger
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 3c03a5d..ae5b37f 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -70,7 +70,8 @@ __copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space. Caller must check
* the specified block with access_ok() before calling this function.
@@ -117,7 +118,8 @@ __copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space. Caller must check
* the specified block with access_ok() before calling this function.
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index e2f5e21..64ba86e 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -647,7 +647,8 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
* @from: Source address, in kernel space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from kernel space to user space.
*
@@ -668,7 +669,8 @@ EXPORT_SYMBOL(_copy_to_user);
* @from: Source address, in user space.
* @n: Number of bytes to copy.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Copy data from user space to kernel space.
*
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index a28df52..0013abd 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -84,7 +84,8 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
* @str: The string to measure.
* @count: Maximum count (including NUL character)
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
@@ -113,7 +114,8 @@ EXPORT_SYMBOL(strnlen_user);
* strlen_user: - Get the size of a user string INCLUDING final NUL.
* @str: The string to measure.
*
- * Context: User context only. This function may sleep.
+ * Context: User context only. This function may sleep if pagefaults are not
+ * disabled.
*
* Get the size of a NUL-terminated string in user space.
*
--
1.8.5.5

2014-12-10 14:24:40

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 3/5] mm, uaccess: trigger might_sleep() in might_fault() when pagefaults are disabled

Commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
pagefault_disable()") removed might_sleep() checks for all user access code
(that uses might_fault()).

The reason was to disable wrong "sleep in atomic" warnings in the following
scenario:
pagefault_disable()
rc = copy_to_user(...)
pagefault_enable()

Which is valid, as pagefault_disable() increments the preempt counter and
therefore disables the pagefault handler. copy_to_user() will not sleep and return
an invalid return code if a page is not available.

However, as all might_sleep() checks are removed, CONFIG_DEBUG_ATOMIC_SLEEP
would no longer detect the following scenario:
spin_lock(&lock);
rc = copy_to_user(...)
spin_unlock(&lock)

If the kernel is compiled with preemption turned on, the preempt counter would
be incremented and copy_to_user() would never sleep. However, with preemption
turned off, the preempt counter will not be touched, we will therefore sleep in
atomic context. We really want to enable CONFIG_DEBUG_ATOMIC_SLEEP checks for
user access functions again, otherwise horrible deadlocks might be hard to debug.

Root of all evil is that pagefault_disable() acted almost as preempt_disable(),
depending on preemption being turned on/off.

As we now have pagefault_disabled(), we can use it to distingusih whether user
acces functions might sleep.

Convert might_fault() into a makro that calls __might_fault(), to allow proper
file + line messages in case of a might_sleep() warning. We can't move the code
directly into kernel.h for now, as that results in ugly header recursions we
can't avoid for now.

Signed-off-by: David Hildenbrand <[email protected]>
---
include/linux/kernel.h | 3 ++-
mm/memory.c | 18 ++++++------------
2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 446d76a..7e65a55 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -232,7 +232,8 @@ static inline u32 reciprocal_scale(u32 val, u32 ep_ro)

#if defined(CONFIG_MMU) && \
(defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP))
-void might_fault(void);
+#define might_fault() __might_fault(__FILE__, __LINE__)
+void __might_fault(const char *file, int line);
#else
static inline void might_fault(void) { }
#endif
diff --git a/mm/memory.c b/mm/memory.c
index 0b3f6c7..563720a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3686,7 +3686,7 @@ void print_vma_addr(char *prefix, unsigned long ip)
}

#if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
-void might_fault(void)
+void __might_fault(const char *file, int line)
{
/*
* Some code (nfs/sunrpc) uses socket ops on kernel memory while
@@ -3696,21 +3696,15 @@ void might_fault(void)
*/
if (segment_eq(get_fs(), KERNEL_DS))
return;
-
- /*
- * it would be nicer only to annotate paths which are not under
- * pagefault_disable, however that requires a larger audit and
- * providing helpers like get_user_atomic.
- */
- if (in_atomic())
+ if (pagefault_disabled())
return;
-
- __might_sleep(__FILE__, __LINE__, 0);
-
+ __might_sleep(file, line, 0);
+#if defined(CONFIG_DEBUG_ATOMIC_SLEEP)
if (current->mm)
might_lock_read(&current->mm->mmap_sem);
+#endif
}
-EXPORT_SYMBOL(might_fault);
+EXPORT_SYMBOL(__might_fault);
#endif

#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
--
1.8.5.5

2014-12-10 14:23:44

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info

This patch adds the pagefault_count to the thread_info of all
architectures. It will be used to count the pagefault_disable() levels
on a per-thread basis.

We are not reusing the preempt_count as this is per cpu on x86 and we want to
demangle pagefault_disable() from preemption in the future.

The new counter is added directly below the preempt_count, except for archs
relying on a manual calculation of asm offsets - to minimize the changes.

Signed-off-by: David Hildenbrand <[email protected]>
---
arch/alpha/include/asm/thread_info.h | 1 +
arch/arc/include/asm/thread_info.h | 1 +
arch/arm/include/asm/thread_info.h | 1 +
arch/arm64/include/asm/thread_info.h | 1 +
arch/avr32/include/asm/thread_info.h | 1 +
arch/blackfin/include/asm/thread_info.h | 1 +
arch/c6x/include/asm/thread_info.h | 1 +
arch/cris/include/asm/thread_info.h | 1 +
arch/frv/include/asm/thread_info.h | 1 +
arch/hexagon/include/asm/thread_info.h | 1 +
arch/ia64/include/asm/thread_info.h | 1 +
arch/m32r/include/asm/thread_info.h | 1 +
arch/m68k/include/asm/thread_info.h | 1 +
arch/metag/include/asm/thread_info.h | 1 +
arch/microblaze/include/asm/thread_info.h | 1 +
arch/mips/include/asm/thread_info.h | 1 +
arch/mn10300/include/asm/thread_info.h | 1 +
arch/openrisc/include/asm/thread_info.h | 1 +
arch/parisc/include/asm/thread_info.h | 1 +
arch/powerpc/include/asm/thread_info.h | 1 +
arch/s390/include/asm/thread_info.h | 1 +
arch/score/include/asm/thread_info.h | 1 +
arch/sh/include/asm/thread_info.h | 1 +
arch/sparc/include/asm/thread_info_32.h | 1 +
arch/sparc/include/asm/thread_info_64.h | 1 +
arch/tile/include/asm/thread_info.h | 1 +
arch/um/include/asm/thread_info.h | 1 +
arch/unicore32/include/asm/thread_info.h | 1 +
arch/x86/include/asm/thread_info.h | 1 +
arch/xtensa/include/asm/thread_info.h | 1 +
30 files changed, 30 insertions(+)

diff --git a/arch/alpha/include/asm/thread_info.h b/arch/alpha/include/asm/thread_info.h
index 48bbea6..1830671 100644
--- a/arch/alpha/include/asm/thread_info.h
+++ b/arch/alpha/include/asm/thread_info.h
@@ -22,6 +22,7 @@ struct thread_info {
mm_segment_t addr_limit; /* thread address space */
unsigned cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
unsigned int status; /* thread-synchronous flags */

int bpt_nsaved;
diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index 02bc5ec..2fde704 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -41,6 +41,7 @@
struct thread_info {
unsigned long flags; /* low level flags */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count; /* pagefault_disable() levels */
struct task_struct *task; /* main task structure */
mm_segment_t addr_limit; /* thread address space */
struct exec_domain *exec_domain;/* execution domain */
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index ce73ab6..bf47d2d 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -51,6 +51,7 @@ struct cpu_context_save {
struct thread_info {
unsigned long flags; /* low level flags */
int preempt_count; /* 0 => preemptable, <0 => bug */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* address limit */
struct task_struct *task; /* main task structure */
struct exec_domain *exec_domain; /* execution domain */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 459bf8e..2469f15 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -50,6 +50,7 @@ struct thread_info {
struct exec_domain *exec_domain; /* execution domain */
struct restart_block restart_block;
int preempt_count; /* 0 => preemptable, <0 => bug */
+ int pagefault_count;/* pagefault_disable() levels */
int cpu; /* cpu */
};

diff --git a/arch/avr32/include/asm/thread_info.h b/arch/avr32/include/asm/thread_info.h
index a978f3f..0c1d6f7 100644
--- a/arch/avr32/include/asm/thread_info.h
+++ b/arch/avr32/include/asm/thread_info.h
@@ -25,6 +25,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
__u32 cpu;
__s32 preempt_count; /* 0 => preemptable, <0 => BUG */
+ __s32 pagefault_count;/* pagefault_disable() levels */
__u32 rar_saved; /* return address... */
__u32 rsr_saved; /* ...and status register
saved by debug handler
diff --git a/arch/blackfin/include/asm/thread_info.h b/arch/blackfin/include/asm/thread_info.h
index 55f473b..3ba26aa 100644
--- a/arch/blackfin/include/asm/thread_info.h
+++ b/arch/blackfin/include/asm/thread_info.h
@@ -41,6 +41,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
int cpu; /* cpu we're on */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count; /* pagefault_disable() levels */
mm_segment_t addr_limit; /* address limit */
struct restart_block restart_block;
#ifndef CONFIG_SMP
diff --git a/arch/c6x/include/asm/thread_info.h b/arch/c6x/include/asm/thread_info.h
index d4e9ef8..6b2dcac 100644
--- a/arch/c6x/include/asm/thread_info.h
+++ b/arch/c6x/include/asm/thread_info.h
@@ -44,6 +44,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
int cpu; /* cpu we're on */
int preempt_count; /* 0 = preemptable, <0 = BUG */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* thread address space */
struct restart_block restart_block;
};
diff --git a/arch/cris/include/asm/thread_info.h b/arch/cris/include/asm/thread_info.h
index 55dede1..3356902 100644
--- a/arch/cris/include/asm/thread_info.h
+++ b/arch/cris/include/asm/thread_info.h
@@ -32,6 +32,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
__u32 tls; /* TLS for this thread */

mm_segment_t addr_limit; /* thread address space:
diff --git a/arch/frv/include/asm/thread_info.h b/arch/frv/include/asm/thread_info.h
index af29e17..79a97ee 100644
--- a/arch/frv/include/asm/thread_info.h
+++ b/arch/frv/include/asm/thread_info.h
@@ -36,6 +36,7 @@ struct thread_info {
unsigned long status; /* thread-synchronous flags */
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */

mm_segment_t addr_limit; /* thread address space:
* 0-0xBFFFFFFF for user-thead
diff --git a/arch/hexagon/include/asm/thread_info.h b/arch/hexagon/include/asm/thread_info.h
index a59dad3..d54042e 100644
--- a/arch/hexagon/include/asm/thread_info.h
+++ b/arch/hexagon/include/asm/thread_info.h
@@ -51,6 +51,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
__u32 cpu; /* current cpu */
int preempt_count; /* 0=>preemptible,<0=>BUG */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* segmentation sux */
/*
* used for syscalls somehow;
diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index 5b17418..14f128c 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -27,6 +27,7 @@ struct thread_info {
__u32 status; /* Thread synchronous flags */
mm_segment_t addr_limit; /* user-level address space limit */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
+ int pagefault_count; /* pagefault_disable() levels */
struct restart_block restart_block;
#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
__u64 ac_stamp;
diff --git a/arch/m32r/include/asm/thread_info.h b/arch/m32r/include/asm/thread_info.h
index 0017170..bf14efb 100644
--- a/arch/m32r/include/asm/thread_info.h
+++ b/arch/m32r/include/asm/thread_info.h
@@ -35,6 +35,7 @@ struct thread_info {
0-0xFFFFFFFF for kernel-thread
*/
struct restart_block restart_block;
+ int pagefault_count;/* pagefault_disable() levels */

__u8 supervisor_stack[0];
};
diff --git a/arch/m68k/include/asm/thread_info.h b/arch/m68k/include/asm/thread_info.h
index 21a4784..5a6a203 100644
--- a/arch/m68k/include/asm/thread_info.h
+++ b/arch/m68k/include/asm/thread_info.h
@@ -29,6 +29,7 @@ struct thread_info {
struct exec_domain *exec_domain; /* execution domain */
mm_segment_t addr_limit; /* thread address space */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
__u32 cpu; /* should always be 0 on m68k */
unsigned long tp_value; /* thread pointer */
struct restart_block restart_block;
diff --git a/arch/metag/include/asm/thread_info.h b/arch/metag/include/asm/thread_info.h
index 4771133..91729f5 100644
--- a/arch/metag/include/asm/thread_info.h
+++ b/arch/metag/include/asm/thread_info.h
@@ -33,6 +33,7 @@ struct thread_info {
unsigned long status; /* thread-synchronous flags */
u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count; /* pagefault_disable() levels */

mm_segment_t addr_limit; /* thread address space */
struct restart_block restart_block;
diff --git a/arch/microblaze/include/asm/thread_info.h b/arch/microblaze/include/asm/thread_info.h
index 8c9d365..f905b02 100644
--- a/arch/microblaze/include/asm/thread_info.h
+++ b/arch/microblaze/include/asm/thread_info.h
@@ -70,6 +70,7 @@ struct thread_info {
unsigned long status; /* thread-synchronous flags */
__u32 cpu; /* current CPU */
__s32 preempt_count; /* 0 => preemptable,< 0 => BUG*/
+ __s32 pagefault_count; /* pagefault_disable() levels */
mm_segment_t addr_limit; /* thread address space */
struct restart_block restart_block;

diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h
index 7de8658..f9f27ac 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -28,6 +28,7 @@ struct thread_info {
unsigned long tp_value; /* thread pointer */
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */

mm_segment_t addr_limit; /*
* thread address space limit:
diff --git a/arch/mn10300/include/asm/thread_info.h b/arch/mn10300/include/asm/thread_info.h
index bf280ea..f6c03a5 100644
--- a/arch/mn10300/include/asm/thread_info.h
+++ b/arch/mn10300/include/asm/thread_info.h
@@ -45,6 +45,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
__u32 cpu; /* current CPU */
__s32 preempt_count; /* 0 => preemptable, <0 => BUG */
+ __s32 pagefault_count;/* pagefault_disable() levels */

mm_segment_t addr_limit; /* thread address space:
0-0xBFFFFFFF for user-thead
diff --git a/arch/openrisc/include/asm/thread_info.h b/arch/openrisc/include/asm/thread_info.h
index d797acc..bdabd6e 100644
--- a/arch/openrisc/include/asm/thread_info.h
+++ b/arch/openrisc/include/asm/thread_info.h
@@ -52,6 +52,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
__u32 cpu; /* current CPU */
__s32 preempt_count; /* 0 => preemptable, <0 => BUG */
+ __s32 pagefault_count;/* pagefault_disable() levels */

mm_segment_t addr_limit; /* thread address space:
0-0x7FFFFFFF for user-thead
diff --git a/arch/parisc/include/asm/thread_info.h b/arch/parisc/include/asm/thread_info.h
index a846118..e37b76b 100644
--- a/arch/parisc/include/asm/thread_info.h
+++ b/arch/parisc/include/asm/thread_info.h
@@ -14,6 +14,7 @@ struct thread_info {
mm_segment_t addr_limit; /* user-level address space limit */
__u32 cpu; /* current CPU */
int preempt_count; /* 0=premptable, <0=BUG; will also serve as bh-counter */
+ int pagefault_count; /* pagefault_disable() levels */
struct restart_block restart_block;
};

diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b034ecd..e8585fd 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -43,6 +43,7 @@ struct thread_info {
int cpu; /* cpu we're on */
int preempt_count; /* 0 => preemptable,
<0 => BUG */
+ int pagefault_count; /* pagefault_disable() levels */
struct restart_block restart_block;
unsigned long local_flags; /* private flags for thread */

diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index 4d62fd5..bbf0513f 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -39,6 +39,7 @@ struct thread_info {
unsigned long sys_call_table; /* System call table address */
unsigned int cpu; /* current CPU */
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
struct restart_block restart_block;
unsigned int system_call;
__u64 user_timer;
diff --git a/arch/score/include/asm/thread_info.h b/arch/score/include/asm/thread_info.h
index 656b7ad..d7f748d 100644
--- a/arch/score/include/asm/thread_info.h
+++ b/arch/score/include/asm/thread_info.h
@@ -35,6 +35,7 @@ struct thread_info {

/* 0 => preemptable, < 0 => BUG */
int preempt_count;
+ int pagefault_count;/* pagefault_disable() levels */

/*
* thread address space:
diff --git a/arch/sh/include/asm/thread_info.h b/arch/sh/include/asm/thread_info.h
index ad27ffa..682a466 100644
--- a/arch/sh/include/asm/thread_info.h
+++ b/arch/sh/include/asm/thread_info.h
@@ -32,6 +32,7 @@ struct thread_info {
__u32 status; /* thread synchronous flags */
__u32 cpu;
int preempt_count; /* 0 => preemptable, <0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* thread address space */
struct restart_block restart_block;
unsigned long previous_sp; /* sp of previous stack in case
diff --git a/arch/sparc/include/asm/thread_info_32.h b/arch/sparc/include/asm/thread_info_32.h
index 025c984..ff0b112 100644
--- a/arch/sparc/include/asm/thread_info_32.h
+++ b/arch/sparc/include/asm/thread_info_32.h
@@ -49,6 +49,7 @@ struct thread_info {
unsigned long w_saved;

struct restart_block restart_block;
+ int pagefault_count;/* pagefault_disable() levels */
};

/*
diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index 798f027..76a60ab 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -62,6 +62,7 @@ struct thread_info {

struct pt_regs *kern_una_regs;
unsigned int kern_una_insn;
+ int pagefault_count;/* pagefault_disable() levels */

unsigned long fpregs[(7 * 256) / sizeof(unsigned long)]
__attribute__ ((aligned(64)));
diff --git a/arch/tile/include/asm/thread_info.h b/arch/tile/include/asm/thread_info.h
index 48e4fd0..57032b6 100644
--- a/arch/tile/include/asm/thread_info.h
+++ b/arch/tile/include/asm/thread_info.h
@@ -33,6 +33,7 @@ struct thread_info {
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable,
<0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */

mm_segment_t addr_limit; /* thread address space
(KERNEL_DS or USER_DS) */
diff --git a/arch/um/include/asm/thread_info.h b/arch/um/include/asm/thread_info.h
index 1c5b2a8..90b193c 100644
--- a/arch/um/include/asm/thread_info.h
+++ b/arch/um/include/asm/thread_info.h
@@ -19,6 +19,7 @@ struct thread_info {
__u32 cpu; /* current CPU */
int preempt_count; /* 0 => preemptable,
<0 => BUG */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* thread address space:
0-0xBFFFFFFF for user
0-0xFFFFFFFF for kernel */
diff --git a/arch/unicore32/include/asm/thread_info.h b/arch/unicore32/include/asm/thread_info.h
index af36d8e..1d50fb3 100644
--- a/arch/unicore32/include/asm/thread_info.h
+++ b/arch/unicore32/include/asm/thread_info.h
@@ -69,6 +69,7 @@ struct thread_info {
unsigned long flags; /* low level flags */
int preempt_count; /* 0 => preemptable */
/* <0 => bug */
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit; /* address limit */
struct task_struct *task; /* main task structure */
struct exec_domain *exec_domain; /* execution domain */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 547e344..fa075ab 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -30,6 +30,7 @@ struct thread_info {
__u32 status; /* thread synchronous flags */
__u32 cpu; /* current CPU */
int saved_preempt_count;
+ int pagefault_count;/* pagefault_disable() levels */
mm_segment_t addr_limit;
struct restart_block restart_block;
void __user *sysenter_return;
diff --git a/arch/xtensa/include/asm/thread_info.h b/arch/xtensa/include/asm/thread_info.h
index 470153e..a866129 100644
--- a/arch/xtensa/include/asm/thread_info.h
+++ b/arch/xtensa/include/asm/thread_info.h
@@ -53,6 +53,7 @@ struct thread_info {
mm_segment_t addr_limit; /* thread address space */
struct restart_block restart_block;

+ __s32 pagefault_count;/* pagefault_disable() levels */
unsigned long cpenable;

/* Allocate storage for extra user states and coprocessor states. */
--
1.8.5.5

2014-12-10 14:25:03

by David Hildenbrand

[permalink] [raw]
Subject: [PATCH v2 2/5] uaccess: count pagefault_disable() levels in pagefault_count

pagefault_disable() and pagefault_enable() have to increment/decrement the
pagefault_count. We keep manipulating the preempt count to retain compability
to existing pagefault handlers. This has to be demangled in separate patches.

It is now possible to verify whether in a pagefault_disable() envionment by
calling pagefault_disabled().

Signed-off-by: David Hildenbrand <[email protected]>
---
include/linux/uaccess.h | 45 ++++++++++++++++++++++++++++++++++++++-------
1 file changed, 38 insertions(+), 7 deletions(-)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index ecd3319..1dfc678 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -2,20 +2,45 @@
#define __LINUX_UACCESS_H__

#include <linux/preempt.h>
+#include <linux/thread_info.h>
#include <asm/uaccess.h>

+static __always_inline int pagefault_count(void)
+{
+ return current_thread_info()->pagefault_count;
+}
+
+static __always_inline int *pagefault_count_ptr(void)
+{
+ return &current_thread_info()->pagefault_count;
+}
+
+static __always_inline void pagefault_count_inc(void)
+{
+ (*pagefault_count_ptr())++;
+}
+
+static __always_inline void pagefault_count_dec(void)
+{
+ (*pagefault_count_ptr())--;
+}
+
/*
- * These routines enable/disable the pagefault handler in that
- * it will not take any locks and go straight to the fixup table.
+ * These routines enable/disable the pagefault handler. If disabled, it will
+ * not take any locks and go straight to the fixup table.
+ *
+ * We increase the preempt and the pagefault count, to be able to distinguish
+ * whether we run in simple atomic context or in a real pagefault_disable()
+ * context.
+ *
+ * For now, after pagefault_disabled() has been called, we run in atomic
+ * context. User access methods will not sleep.
*
- * They have great resemblance to the preempt_disable/enable calls
- * and in fact they are identical; this is because currently there is
- * no other way to make the pagefault handlers do this. So we do
- * disable preemption but we don't necessarily care about that.
*/
static inline void pagefault_disable(void)
{
preempt_count_inc();
+ pagefault_count_inc();
/*
* make sure to have issued the store before a pagefault
* can hit.
@@ -25,18 +50,24 @@ static inline void pagefault_disable(void)

static inline void pagefault_enable(void)
{
-#ifndef CONFIG_PREEMPT
/*
* make sure to issue those last loads/stores before enabling
* the pagefault handler again.
*/
barrier();
+ pagefault_count_dec();
+#ifndef CONFIG_PREEMPT
preempt_count_dec();
#else
preempt_enable();
#endif
}

+/*
+ * Is the pagefault handler disabled? If so, user access methods will not sleep.
+ */
+#define pagefault_disabled() (pagefault_count() != 0)
+
#ifndef ARCH_HAS_NOCACHE_UACCESS

static inline unsigned long __copy_from_user_inatomic_nocache(void *to,
--
1.8.5.5

2014-12-15 10:08:01

by Ley Foon Tan

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info

On Wed, Dec 10, 2014 at 10:23 PM, David Hildenbrand
<[email protected]> wrote:
> This patch adds the pagefault_count to the thread_info of all
> architectures. It will be used to count the pagefault_disable() levels
> on a per-thread basis.
>
> We are not reusing the preempt_count as this is per cpu on x86 and we want to
> demangle pagefault_disable() from preemption in the future.
>
> The new counter is added directly below the preempt_count, except for archs
> relying on a manual calculation of asm offsets - to minimize the changes.
>
Hi David

Is this patchset targeted for 3.19? If yes, then we need this for
arch/nios2 as well (new arch in 3.19).

2014-12-15 12:48:41

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info

On Mon, Dec 15, 2014 at 12:23:01PM +0100, David Hildenbrand wrote:
> Peter just showed my that there is some work ongoing on that
> pagefault_disable() topic. So I am not sure if the arch-specific part of this
> series is still relevant.

Well 'on-going' would be over stating it. We did those bits in -rt many
years ago, but never really made a strong enough effort to get them
upstream.

If there now is a 'semi' sane use-case for upstreaming them we should
indeed look at that.

2014-12-15 12:50:38

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

On Mon, Dec 15, 2014 at 12:21:27PM +0100, David Hildenbrand wrote:
> > On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:
> >
> > Did you look at the -rt patches where this comes from?
> >
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
> > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0
> >
> >
>
> Thanks for the links - haven't seen these patches so far (somebody on the list
> just mentioned that someone tried to demangle that stuff some time ago). But
> good to know that somebody is working on that pagefault_disable() thing.
>
> Do you know what the plans for this series are? So I can base my patches
> (might_sleep() checks for might_fault()) on that queue.

As stated in that other email, there's no active work on this atm. Its
just what -rt needed the pagefault_{en,dis}able() bits for. I think we
should try and merge some of that upstream now that there is a stronger
use case.

2014-12-15 13:08:19

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

> On Mon, Dec 15, 2014 at 12:21:27PM +0100, David Hildenbrand wrote:
> > > On Wed, Dec 10, 2014 at 03:23:29PM +0100, David Hildenbrand wrote:
> > >
> > > Did you look at the -rt patches where this comes from?
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=b389ced19ab649438196d132768fe6522d2f052b
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=4fb7f9d416f7b34036d9d1b209e77c462ac0ee10
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=c730a4aade9e5c9b84f65de01d6612bf70c577e3
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=d365f5bf933e988a39874b33302f02ff6c7989b7
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=93eb18f43dfa5d49d99e2b8ad236eea2c35dd539
> > > https://git.kernel.org/cgit/linux/kernel/git/clrkwllms/rt-linux.git/commit/?h=v3.14.21-rt9&id=8947442e896921e1b645f9e1fd0f2beee103bba0
> > >
> > >
> >
> > Thanks for the links - haven't seen these patches so far (somebody on the list
> > just mentioned that someone tried to demangle that stuff some time ago). But
> > good to know that somebody is working on that pagefault_disable() thing.
> >
> > Do you know what the plans for this series are? So I can base my patches
> > (might_sleep() checks for might_fault()) on that queue.
>
> As stated in that other email, there's no active work on this atm. Its
> just what -rt needed the pagefault_{en,dis}able() bits for. I think we
> should try and merge some of that upstream now that there is a stronger
> use case.
>

Ah, now I get it. So the main question is which approach is better:

a) -rt version: Store the pagefault_count in struct task_struct()
b) my version: Storing it in thread_info

IOW: My series first and the -rt part (pagefault handlers, preempt fixup) on
top or -rt version first and my work (patch 3 + 4 ) on top.

Getting rid of that whole preemption handling in pagefault_disable() / fixing up
the pagefault handlers is something I would have addressed in future patches,
but that part seems to be just fine in the -rt code.

Thanks for having a look!

David

2014-12-15 16:42:41

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] uaccess: add pagefault_count to thread_info

> On Wed, Dec 10, 2014 at 10:23 PM, David Hildenbrand
> <[email protected]> wrote:
> > This patch adds the pagefault_count to the thread_info of all
> > architectures. It will be used to count the pagefault_disable() levels
> > on a per-thread basis.
> >
> > We are not reusing the preempt_count as this is per cpu on x86 and we want to
> > demangle pagefault_disable() from preemption in the future.
> >
> > The new counter is added directly below the preempt_count, except for archs
> > relying on a manual calculation of asm offsets - to minimize the changes.
> >
> Hi David
>
> Is this patchset targeted for 3.19? If yes, then we need this for
> arch/nios2 as well (new arch in 3.19).
>

Hi,

Peter just showed my that there is some work ongoing on that
pagefault_disable() topic. So I am not sure if the arch-specific part of this
series is still relevant.

But thanks for the info!

David

2015-02-09 14:42:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

On Mon, Jan 12, 2015 at 03:19:11PM +0100, David Hildenbrand wrote:
> Thomas, Peter,
>
> anything that speaks against putting the pagefault_disable counter into
> thread_info (my series) instead of task_struct (rt tree)?
>
> IOW, what would be the right place for it?

I think we put it in task_struct because lazy; ARM seems one of the few
popular archs where current still goes through thread_info.

And that I think is the only reason to maybe use thread_info, cost of
access. The down-side of using thread_info is of course that it reduces
stack size.

In any case; I think that if you want to go do this; please consider the
route -rt took and completely separate the two, don't leave the
preempt_count_{inc,dec} remnant in pagefault_{en,dis}able() at all.

2015-02-19 14:48:16

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

> On Mon, Jan 12, 2015 at 03:19:11PM +0100, David Hildenbrand wrote:
> > Thomas, Peter,
> >
> > anything that speaks against putting the pagefault_disable counter into
> > thread_info (my series) instead of task_struct (rt tree)?
> >
> > IOW, what would be the right place for it?
>
> I think we put it in task_struct because lazy; ARM seems one of the few
> popular archs where current still goes through thread_info.
>
> And that I think is the only reason to maybe use thread_info, cost of
> access. The down-side of using thread_info is of course that it reduces
> stack size.
>
> In any case; I think that if you want to go do this; please consider the
> route -rt took and completely separate the two, don't leave the
> preempt_count_{inc,dec} remnant in pagefault_{en,dis}able() at all.
>
>

Thanks Peter,

I am currently preparing/testing a series that does the requested separation
(getting rid of preempt_count_{inc,dec} ...) while putting the pagefault disable
count into task_info.

Downside is that now that I have to touch all fault handlers, I have to go
through all archs again.

Think I'll have something to show in a couple of days.

David

2015-02-19 15:07:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

On Thu, Feb 19, 2015 at 03:48:05PM +0100, David Hildenbrand wrote:
> Downside is that now that I have to touch all fault handlers, I have to go
> through all archs again.

You should be able to borrow from the -rt patches there. They have all
that.

2015-02-19 15:14:53

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Reenable might_sleep() checks for might_fault()

> On Thu, Feb 19, 2015 at 03:48:05PM +0100, David Hildenbrand wrote:
> > Downside is that now that I have to touch all fault handlers, I have to go
> > through all archs again.
>
> You should be able to borrow from the -rt patches there. They have all
> that.
>

Jup, that's what I partially did.

Thanks

David