2019-11-21 12:00:39

by Will Deacon

Subject: [RESEND PATCH v4 00/10] Rework REFCOUNT_FULL using atomic_fetch_* operations

Hi everybody,

This is a resend of version four of the patches I posted here:

v4: https://lore.kernel.org/lkml/[email protected]

Previous versions can be found at:

v1: https://lkml.kernel.org/r/[email protected]
v2: https://lkml.kernel.org/r/[email protected]
v3: https://lkml.kernel.org/r/[email protected]

I didn't receive any feedback last time around, other than some positive
noises from Kees, so please consider this for inclusion in mainline.

Thanks,

Will

Cc: Kees Cook <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Hanjun Guo <[email protected]>

--->8

Will Deacon (10):
lib/refcount: Define constants for saturation and max refcount values
lib/refcount: Ensure integer operands are treated as signed
lib/refcount: Remove unused refcount_*_checked() variants
lib/refcount: Move bulk of REFCOUNT_FULL implementation into header
lib/refcount: Improve performance of generic REFCOUNT_FULL code
lib/refcount: Move saturation warnings out of line
lib/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
refcount: Consolidate implementations of refcount_t
lib/refcount: Remove unused 'refcount_error_report()' function
drivers/lkdtm: Remove references to CONFIG_REFCOUNT_FULL

arch/Kconfig | 21 ---
arch/arm/Kconfig | 1 -
arch/arm64/Kconfig | 1 -
arch/s390/configs/debug_defconfig | 1 -
arch/x86/Kconfig | 1 -
arch/x86/include/asm/asm.h | 6 -
arch/x86/include/asm/refcount.h | 126 --------------
arch/x86/mm/extable.c | 49 ------
drivers/gpu/drm/i915/Kconfig.debug | 1 -
drivers/misc/lkdtm/refcount.c | 11 +-
include/linux/kernel.h | 7 -
include/linux/refcount.h | 269 ++++++++++++++++++++++++-----
kernel/panic.c | 11 --
lib/refcount.c | 255 +++------------------------
14 files changed, 257 insertions(+), 503 deletions(-)
delete mode 100644 arch/x86/include/asm/refcount.h

--
2.24.0.432.g9d3f5f5b63-goog


2019-11-21 12:00:43

by Will Deacon

Subject: [RESEND PATCH v4 02/10] lib/refcount: Ensure integer operands are treated as signed

In preparation for changing the saturation point of REFCOUNT_FULL to
INT_MIN / 2, change the type of integer operands passed into the API
from 'unsigned int' to 'int'. This avoids casting during comparisons,
where we would otherwise fall foul of the C integral-conversion rules
for signed and unsigned types.

Since the kernel is compiled with '-fno-strict-overflow', we don't need
to worry about the UB introduced by signed overflow here. Furthermore,
we're already making heavy use of the atomic_t API, which operates
exclusively on signed types.
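
As a minimal userspace illustration of the conversion pitfall this
avoids (a hypothetical snippet, not kernel code): when one operand of a
comparison is 'unsigned int', the usual arithmetic conversions silently
convert the signed operand to unsigned, so a negative value such as
INT_MIN / 2 compares as a huge positive number.

#include <stdio.h>
#include <limits.h>

int main(void)
{
        unsigned int val = 0;           /* counter value read back */
        int saturated = INT_MIN / 2;    /* proposed saturation point */

        /* 'saturated' is converted to 0xc0000000u here, so this is true! */
        if (val < saturated)
                printf("mixed comparison: 0 appears less than INT_MIN/2\n");

        /* With both operands signed, the comparison behaves as expected. */
        if ((int)val > saturated)
                printf("signed comparison: 0 is greater than INT_MIN/2\n");

        return 0;
}

Keeping the whole API on 'int' sidesteps this class of bug without
sprinkling casts at every comparison.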

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 14 +++++++-------
lib/refcount.c | 6 +++---
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 79f62e8d2256..89066a1471dd 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -28,7 +28,7 @@ typedef struct refcount_struct {
* @r: the refcount
* @n: value to which the refcount will be set
*/
-static inline void refcount_set(refcount_t *r, unsigned int n)
+static inline void refcount_set(refcount_t *r, int n)
{
atomic_set(&r->refs, n);
}
@@ -44,13 +44,13 @@ static inline unsigned int refcount_read(const refcount_t *r)
return atomic_read(&r->refs);
}

-extern __must_check bool refcount_add_not_zero_checked(unsigned int i, refcount_t *r);
-extern void refcount_add_checked(unsigned int i, refcount_t *r);
+extern __must_check bool refcount_add_not_zero_checked(int i, refcount_t *r);
+extern void refcount_add_checked(int i, refcount_t *r);

extern __must_check bool refcount_inc_not_zero_checked(refcount_t *r);
extern void refcount_inc_checked(refcount_t *r);

-extern __must_check bool refcount_sub_and_test_checked(unsigned int i, refcount_t *r);
+extern __must_check bool refcount_sub_and_test_checked(int i, refcount_t *r);

extern __must_check bool refcount_dec_and_test_checked(refcount_t *r);
extern void refcount_dec_checked(refcount_t *r);
@@ -79,12 +79,12 @@ extern void refcount_dec_checked(refcount_t *r);
# ifdef CONFIG_ARCH_HAS_REFCOUNT
# include <asm/refcount.h>
# else
-static inline __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
+static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
{
return atomic_add_unless(&r->refs, i, 0);
}

-static inline void refcount_add(unsigned int i, refcount_t *r)
+static inline void refcount_add(int i, refcount_t *r)
{
atomic_add(i, &r->refs);
}
@@ -99,7 +99,7 @@ static inline void refcount_inc(refcount_t *r)
atomic_inc(&r->refs);
}

-static inline __must_check bool refcount_sub_and_test(unsigned int i, refcount_t *r)
+static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
{
return atomic_sub_and_test(i, &r->refs);
}
diff --git a/lib/refcount.c b/lib/refcount.c
index 48b78a423d7d..719b0bc42ab1 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -61,7 +61,7 @@
*
* Return: false if the passed refcount is 0, true otherwise
*/
-bool refcount_add_not_zero_checked(unsigned int i, refcount_t *r)
+bool refcount_add_not_zero_checked(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -101,7 +101,7 @@ EXPORT_SYMBOL(refcount_add_not_zero_checked);
* cases, refcount_inc(), or one of its variants, should instead be used to
* increment a reference count.
*/
-void refcount_add_checked(unsigned int i, refcount_t *r)
+void refcount_add_checked(int i, refcount_t *r)
{
WARN_ONCE(!refcount_add_not_zero_checked(i, r), "refcount_t: addition on 0; use-after-free.\n");
}
@@ -180,7 +180,7 @@ EXPORT_SYMBOL(refcount_inc_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_sub_and_test_checked(unsigned int i, refcount_t *r)
+bool refcount_sub_and_test_checked(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:00:55

by Will Deacon

Subject: [RESEND PATCH v4 04/10] lib/refcount: Move bulk of REFCOUNT_FULL implementation into header

In an effort to improve performance of the REFCOUNT_FULL implementation,
move the bulk of its functions into linux/refcount.h. This allows them
to be inlined in the same way as if they had been provided via
CONFIG_ARCH_HAS_REFCOUNT.
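
For context, a sketch of the inlining pattern being adopted (a
userspace C11 analogue with made-up names, saturation handling omitted;
not the patch itself): a 'static inline' definition in a header lets
the compiler inline the cmpxchg loop at every call site, whereas the
previous 'extern' declaration backed by lib/refcount.c forced an
out-of-line call.

#include <stdatomic.h>
#include <stdbool.h>

/* In a "header": each call site can now inline this. */
static inline bool cnt_inc_not_zero(atomic_int *c)
{
        int old = atomic_load_explicit(c, memory_order_relaxed);

        do {
                if (!old)
                        return false;   /* refuse to resurrect a dead object */
        } while (!atomic_compare_exchange_weak_explicit(c, &old, old + 1,
                                                        memory_order_relaxed,
                                                        memory_order_relaxed));
        return true;
}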

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 237 ++++++++++++++++++++++++++++++++++++--
lib/refcount.c | 238 +--------------------------------------
2 files changed, 229 insertions(+), 246 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index edd505d1a23b..e719b5b1220e 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -45,22 +45,241 @@ static inline unsigned int refcount_read(const refcount_t *r)
}

#ifdef CONFIG_REFCOUNT_FULL
+#include <linux/bug.h>

#define REFCOUNT_MAX (UINT_MAX - 1)
#define REFCOUNT_SATURATED UINT_MAX

-extern __must_check bool refcount_add_not_zero(int i, refcount_t *r);
-extern void refcount_add(int i, refcount_t *r);
+/*
+ * Variant of atomic_t specialized for reference counts.
+ *
+ * The interface matches the atomic_t interface (to aid in porting) but only
+ * provides the few functions one should use for reference counting.
+ *
+ * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
+ * move once there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free issues.
+ *
+ * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
+ * and provide only what is strictly required for refcounts.
+ *
+ * The increments are fully relaxed; these will not provide ordering. The
+ * rationale is that whatever is used to obtain the object we're increasing the
+ * reference count on will provide the ordering. For locked data structures,
+ * its the lock acquire, for RCU/lockless data structures its the dependent
+ * load.
+ *
+ * Do note that inc_not_zero() provides a control dependency which will order
+ * future stores against the inc, this ensures we'll never modify the object
+ * if we did not in fact acquire a reference.
+ *
+ * The decrements will provide release order, such that all the prior loads and
+ * stores will be issued before, it also provides a control dependency, which
+ * will order us against the subsequent free().
+ *
+ * The control dependency is against the load of the cmpxchg (ll/sc) that
+ * succeeded. This means the stores aren't fully ordered, but this is fine
+ * because the 1->0 transition indicates no concurrency.
+ *
+ * Note that the allocator is responsible for ordering things between free()
+ * and alloc().
+ *
+ * The decrements dec_and_test() and sub_and_test() also provide acquire
+ * ordering on success.
+ *
+ */
+
+/**
+ * refcount_add_not_zero - add a value to a refcount unless it is 0
+ * @i: the value to add to the refcount
+ * @r: the refcount
+ *
+ * Will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_inc(), or one of its variants, should instead be used to
+ * increment a reference count.
+ *
+ * Return: false if the passed refcount is 0, true otherwise
+ */
+static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ if (!val)
+ return false;
+
+ if (unlikely(val == REFCOUNT_SATURATED))
+ return true;
+
+ new = val + i;
+ if (new < val)
+ new = REFCOUNT_SATURATED;
+
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
+
+ WARN_ONCE(new == REFCOUNT_SATURATED,
+ "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/**
+ * refcount_add - add a value to a refcount
+ * @i: the value to add to the refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_add(), but will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_inc(), or one of its variants, should instead be used to
+ * increment a reference count.
+ */
+static inline void refcount_add(int i, refcount_t *r)
+{
+ WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
+}
+
+/**
+ * refcount_inc_not_zero - increment a refcount unless it is 0
+ * @r: the refcount to increment
+ *
+ * Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
+ * and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Return: true if the increment was successful, false otherwise
+ */
+static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ new = val + 1;

-extern __must_check bool refcount_inc_not_zero(refcount_t *r);
-extern void refcount_inc(refcount_t *r);
+ if (!val)
+ return false;

-extern __must_check bool refcount_sub_and_test(int i, refcount_t *r);
+ if (unlikely(!new))
+ return true;

-extern __must_check bool refcount_dec_and_test(refcount_t *r);
-extern void refcount_dec(refcount_t *r);
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
+
+ WARN_ONCE(new == REFCOUNT_SATURATED,
+ "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/**
+ * refcount_inc - increment a refcount
+ * @r: the refcount to increment
+ *
+ * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller already has a
+ * reference on the object.
+ *
+ * Will WARN if the refcount is 0, as this represents a possible use-after-free
+ * condition.
+ */
+static inline void refcount_inc(refcount_t *r)
+{
+ WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+}
+
+/**
+ * refcount_sub_and_test - subtract from a refcount and test if it is 0
+ * @i: amount to subtract from the refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_dec_and_test(), but it will WARN, return false and
+ * ultimately leak on underflow and will fail to decrement when saturated
+ * at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides an acquire ordering on success such that free()
+ * must come after.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_dec(), or one of its variants, should instead be used to
+ * decrement a reference count.
+ *
+ * Return: true if the resulting refcount is 0, false otherwise
+ */
+static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ if (unlikely(val == REFCOUNT_SATURATED))
+ return false;
+
+ new = val - i;
+ if (new > val) {
+ WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
+ return false;
+ }
+
+ } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
+
+ if (!new) {
+ smp_acquire__after_ctrl_dep();
+ return true;
+ }
+ return false;
+
+}
+
+/**
+ * refcount_dec_and_test - decrement a refcount and test if it is 0
+ * @r: the refcount
+ *
+ * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
+ * decrement when saturated at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides an acquire ordering on success such that free()
+ * must come after.
+ *
+ * Return: true if the resulting refcount is 0, false otherwise
+ */
+static inline __must_check bool refcount_dec_and_test(refcount_t *r)
+{
+ return refcount_sub_and_test(1, r);
+}
+
+/**
+ * refcount_dec - decrement a refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_dec(), it will WARN on underflow and fail to decrement
+ * when saturated at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before.
+ */
+static inline void refcount_dec(refcount_t *r)
+{
+ WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
+}

-#else
+#else /* CONFIG_REFCOUNT_FULL */

#define REFCOUNT_MAX INT_MAX
#define REFCOUNT_SATURATED (INT_MIN / 2)
@@ -103,7 +322,7 @@ static inline void refcount_dec(refcount_t *r)
atomic_dec(&r->refs);
}
# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
-#endif /* CONFIG_REFCOUNT_FULL */
+#endif /* !CONFIG_REFCOUNT_FULL */

extern __must_check bool refcount_dec_if_one(refcount_t *r);
extern __must_check bool refcount_dec_not_one(refcount_t *r);
diff --git a/lib/refcount.c b/lib/refcount.c
index a2f670998cee..3a534fbebdcc 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -1,41 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Variant of atomic_t specialized for reference counts.
- *
- * The interface matches the atomic_t interface (to aid in porting) but only
- * provides the few functions one should use for reference counting.
- *
- * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
- * move once there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free issues.
- *
- * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
- * and provide only what is strictly required for refcounts.
- *
- * The increments are fully relaxed; these will not provide ordering. The
- * rationale is that whatever is used to obtain the object we're increasing the
- * reference count on will provide the ordering. For locked data structures,
- * its the lock acquire, for RCU/lockless data structures its the dependent
- * load.
- *
- * Do note that inc_not_zero() provides a control dependency which will order
- * future stores against the inc, this ensures we'll never modify the object
- * if we did not in fact acquire a reference.
- *
- * The decrements will provide release order, such that all the prior loads and
- * stores will be issued before, it also provides a control dependency, which
- * will order us against the subsequent free().
- *
- * The control dependency is against the load of the cmpxchg (ll/sc) that
- * succeeded. This means the stores aren't fully ordered, but this is fine
- * because the 1->0 transition indicates no concurrency.
- *
- * Note that the allocator is responsible for ordering things between free()
- * and alloc().
- *
- * The decrements dec_and_test() and sub_and_test() also provide acquire
- * ordering on success.
- *
+ * Out-of-line refcount functions common to all refcount implementations.
*/

#include <linux/mutex.h>
@@ -43,207 +8,6 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

-#ifdef CONFIG_REFCOUNT_FULL
-
-/**
- * refcount_add_not_zero - add a value to a refcount unless it is 0
- * @i: the value to add to the refcount
- * @r: the refcount
- *
- * Will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_inc(), or one of its variants, should instead be used to
- * increment a reference count.
- *
- * Return: false if the passed refcount is 0, true otherwise
- */
-bool refcount_add_not_zero(int i, refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (!val)
- return false;
-
- if (unlikely(val == REFCOUNT_SATURATED))
- return true;
-
- new = val + i;
- if (new < val)
- new = REFCOUNT_SATURATED;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
-}
-EXPORT_SYMBOL(refcount_add_not_zero);
-
-/**
- * refcount_add - add a value to a refcount
- * @i: the value to add to the refcount
- * @r: the refcount
- *
- * Similar to atomic_add(), but will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_inc(), or one of its variants, should instead be used to
- * increment a reference count.
- */
-void refcount_add(int i, refcount_t *r)
-{
- WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
-}
-EXPORT_SYMBOL(refcount_add);
-
-/**
- * refcount_inc_not_zero - increment a refcount unless it is 0
- * @r: the refcount to increment
- *
- * Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
- * and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Return: true if the increment was successful, false otherwise
- */
-bool refcount_inc_not_zero(refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- new = val + 1;
-
- if (!val)
- return false;
-
- if (unlikely(!new))
- return true;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
-}
-EXPORT_SYMBOL(refcount_inc_not_zero);
-
-/**
- * refcount_inc - increment a refcount
- * @r: the refcount to increment
- *
- * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller already has a
- * reference on the object.
- *
- * Will WARN if the refcount is 0, as this represents a possible use-after-free
- * condition.
- */
-void refcount_inc(refcount_t *r)
-{
- WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
-}
-EXPORT_SYMBOL(refcount_inc);
-
-/**
- * refcount_sub_and_test - subtract from a refcount and test if it is 0
- * @i: amount to subtract from the refcount
- * @r: the refcount
- *
- * Similar to atomic_dec_and_test(), but it will WARN, return false and
- * ultimately leak on underflow and will fail to decrement when saturated
- * at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before, and provides an acquire ordering on success such that free()
- * must come after.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_dec(), or one of its variants, should instead be used to
- * decrement a reference count.
- *
- * Return: true if the resulting refcount is 0, false otherwise
- */
-bool refcount_sub_and_test(int i, refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (unlikely(val == REFCOUNT_SATURATED))
- return false;
-
- new = val - i;
- if (new > val) {
- WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
- return false;
- }
-
- } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
-
- if (!new) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
- return false;
-
-}
-EXPORT_SYMBOL(refcount_sub_and_test);
-
-/**
- * refcount_dec_and_test - decrement a refcount and test if it is 0
- * @r: the refcount
- *
- * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
- * decrement when saturated at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before, and provides an acquire ordering on success such that free()
- * must come after.
- *
- * Return: true if the resulting refcount is 0, false otherwise
- */
-bool refcount_dec_and_test(refcount_t *r)
-{
- return refcount_sub_and_test(1, r);
-}
-EXPORT_SYMBOL(refcount_dec_and_test);
-
-/**
- * refcount_dec - decrement a refcount
- * @r: the refcount
- *
- * Similar to atomic_dec(), it will WARN on underflow and fail to decrement
- * when saturated at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before.
- */
-void refcount_dec(refcount_t *r)
-{
- WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
-}
-EXPORT_SYMBOL(refcount_dec);
-
-#endif /* CONFIG_REFCOUNT_FULL */
-
/**
* refcount_dec_if_one - decrement a refcount if it is 1
* @r: the refcount
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:00:58

by Will Deacon

Subject: [RESEND PATCH v4 06/10] lib/refcount: Move saturation warnings out of line

Having the refcount saturation and warnings inline bloats the text,
despite the fact that these paths should never be executed in normal
operation.

Move the refcount saturation and warnings out of line to reduce the
image size when refcount_t checking is enabled. Relative to an x86_64
defconfig, the sizes reported by bloat-o-meter are:

# defconfig+REFCOUNT_FULL, inline saturation (i.e. before this patch)
Total: Before=14762076, After=14915442, chg +1.04%

# defconfig+REFCOUNT_FULL, out-of-line saturation (i.e. after this patch)
Total: Before=14762076, After=14835497, chg +0.50%

A side-effect of this change is that we now only get one warning per
refcount saturation type, rather than one per problematic call-site.
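
The shape of the change, condensed into a userspace C11 sketch (made-up
names; the real enum and refcount_warn_saturate() are in the diff
below): the inline fast path shrinks to one atomic plus a branch that
is never taken in normal operation, and the warning strings live in a
single shared, out-of-line function instead of at every call site.

#include <stdio.h>
#include <stdatomic.h>
#include <limits.h>

/* One shared, out-of-line copy of the warning/saturation logic. */
void cnt_warn_saturate(atomic_int *c, const char *event)
{
        atomic_store_explicit(c, INT_MIN / 2, memory_order_relaxed);
        fprintf(stderr, "cnt: %s\n", event);
}

/* Inline fast path: a single fetch_sub and a cold branch. */
static inline void cnt_dec(atomic_int *c)
{
        if (atomic_fetch_sub_explicit(c, 1, memory_order_release) <= 1)
                cnt_warn_saturate(c, "decrement hit 0; leaking memory");
}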

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 39 ++++++++++++++++++++-------------------
lib/refcount.c | 28 ++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index e3b218d669ce..1cd0a876a789 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -23,6 +23,16 @@ typedef struct refcount_struct {

#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }

+enum refcount_saturation_type {
+ REFCOUNT_ADD_NOT_ZERO_OVF,
+ REFCOUNT_ADD_OVF,
+ REFCOUNT_ADD_UAF,
+ REFCOUNT_SUB_UAF,
+ REFCOUNT_DEC_LEAK,
+};
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
+
/**
* refcount_set - set a refcount's value
* @r: the refcount
@@ -154,10 +164,8 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
break;
} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));

- if (unlikely(old < 0 || old + i < 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: saturated; leaking memory.\n");
- }
+ if (unlikely(old < 0 || old + i < 0))
+ refcount_warn_saturate(r, REFCOUNT_ADD_NOT_ZERO_OVF);

return old;
}
@@ -182,11 +190,10 @@ static inline void refcount_add(int i, refcount_t *r)
{
int old = atomic_fetch_add_relaxed(i, &r->refs);

- WARN_ONCE(!old, "refcount_t: addition on 0; use-after-free.\n");
- if (unlikely(old <= 0 || old + i <= 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(old, "refcount_t: saturated; leaking memory.\n");
- }
+ if (unlikely(!old))
+ refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
+ else if (unlikely(old < 0 || old + i < 0))
+ refcount_warn_saturate(r, REFCOUNT_ADD_OVF);
}

/**
@@ -253,10 +260,8 @@ static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
return true;
}

- if (unlikely(old < 0 || old - i < 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: underflow; use-after-free.\n");
- }
+ if (unlikely(old < 0 || old - i < 0))
+ refcount_warn_saturate(r, REFCOUNT_SUB_UAF);

return false;
}
@@ -291,12 +296,8 @@ static inline __must_check bool refcount_dec_and_test(refcount_t *r)
*/
static inline void refcount_dec(refcount_t *r)
{
- int old = atomic_fetch_sub_release(1, &r->refs);
-
- if (unlikely(old <= 1)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: decrement hit 0; leaking memory.\n");
- }
+ if (unlikely(atomic_fetch_sub_release(1, &r->refs) <= 1))
+ refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
#else /* CONFIG_REFCOUNT_FULL */

diff --git a/lib/refcount.c b/lib/refcount.c
index 3a534fbebdcc..8b7e249c0e10 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -8,6 +8,34 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

+#define REFCOUNT_WARN(str) WARN_ONCE(1, "refcount_t: " str ".\n")
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t)
+{
+ refcount_set(r, REFCOUNT_SATURATED);
+
+ switch (t) {
+ case REFCOUNT_ADD_NOT_ZERO_OVF:
+ REFCOUNT_WARN("saturated; leaking memory");
+ break;
+ case REFCOUNT_ADD_OVF:
+ REFCOUNT_WARN("saturated; leaking memory");
+ break;
+ case REFCOUNT_ADD_UAF:
+ REFCOUNT_WARN("addition on 0; use-after-free");
+ break;
+ case REFCOUNT_SUB_UAF:
+ REFCOUNT_WARN("underflow; use-after-free");
+ break;
+ case REFCOUNT_DEC_LEAK:
+ REFCOUNT_WARN("decrement hit 0; leaking memory");
+ break;
+ default:
+ REFCOUNT_WARN("unknown saturation event!?");
+ }
+}
+EXPORT_SYMBOL(refcount_warn_saturate);
+
/**
* refcount_dec_if_one - decrement a refcount if it is 1
* @r: the refcount
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:01:07

by Will Deacon

Subject: [RESEND PATCH v4 08/10] refcount: Consolidate implementations of refcount_t

The generic implementation of refcount_t should be good enough for
everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely,
leaving the generic implementation enabled unconditionally.

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/Kconfig | 21 ----
arch/arm/Kconfig | 1 -
arch/arm64/Kconfig | 1 -
arch/s390/configs/debug_defconfig | 1 -
arch/x86/Kconfig | 1 -
arch/x86/include/asm/asm.h | 6 --
arch/x86/include/asm/refcount.h | 126 -----------------------
arch/x86/mm/extable.c | 49 ---------
drivers/gpu/drm/i915/Kconfig.debug | 1 -
include/linux/refcount.h | 158 +++++++++++------------------
lib/refcount.c | 2 +-
11 files changed, 59 insertions(+), 308 deletions(-)
delete mode 100644 arch/x86/include/asm/refcount.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 5f8a5d84dbbe..8bcc1c746142 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -892,27 +892,6 @@ config STRICT_MODULE_RWX
config ARCH_HAS_PHYS_TO_DMA
bool

-config ARCH_HAS_REFCOUNT
- bool
- help
- An architecture selects this when it has implemented refcount_t
- using open coded assembly primitives that provide an optimized
- refcount_t implementation, possibly at the expense of some full
- refcount state checks of CONFIG_REFCOUNT_FULL=y.
-
- The refcount overflow check behavior, however, must be retained.
- Catching overflows is the primary security concern for protecting
- against bugs in reference counts.
-
-config REFCOUNT_FULL
- bool "Perform full reference count validation at the expense of speed"
- help
- Enabling this switches the refcounting infrastructure from a fast
- unchecked atomic_t implementation to a fully state checked
- implementation, which can be (slightly) slower but provides protections
- against various use-after-free conditions that can be used in
- security flaw exploits.
-
config HAVE_ARCH_COMPILER_H
bool
help
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8a50efb559f3..0d3c5d7cceb7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -117,7 +117,6 @@ config ARM
select OLD_SIGSUSPEND3
select PCI_SYSCALL if PCI
select PERF_USE_VMALLOC
- select REFCOUNT_FULL
select RTC_LIB
select SYS_SUPPORTS_APM_EMULATION
# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 41a9b4257b72..bc990d3abfe9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -181,7 +181,6 @@ config ARM64
select PCI_SYSCALL if PCI
select POWER_RESET
select POWER_SUPPLY
- select REFCOUNT_FULL
select SPARSE_IRQ
select SWIOTLB
select SYSCTL_EXCEPTION_TRACE
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 38d64030aacf..2e60c80395ab 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -62,7 +62,6 @@ CONFIG_OPROFILE=m
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_STATIC_KEYS_SELFTEST=y
-CONFIG_REFCOUNT_FULL=y
CONFIG_LOCK_EVENT_COUNTS=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d6e1faa28c58..fa6274f1e370 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -73,7 +73,6 @@ config X86
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_PTE_DEVMAP if X86_64
select ARCH_HAS_PTE_SPECIAL
- select ARCH_HAS_REFCOUNT
select ARCH_HAS_UACCESS_FLUSHCACHE if X86_64
select ARCH_HAS_UACCESS_MCSAFE if X86_64 && X86_MCE
select ARCH_HAS_SET_MEMORY
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 3ff577c0b102..5a0c14ebef70 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -139,9 +139,6 @@
# define _ASM_EXTABLE_EX(from, to) \
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)

-# define _ASM_EXTABLE_REFCOUNT(from, to) \
- _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
-
# define _ASM_NOKPROBE(entry) \
.pushsection "_kprobe_blacklist","aw" ; \
_ASM_ALIGN ; \
@@ -170,9 +167,6 @@
# define _ASM_EXTABLE_EX(from, to) \
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)

-# define _ASM_EXTABLE_REFCOUNT(from, to) \
- _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
-
/* For C file, we already have NOKPROBE_SYMBOL macro */
#endif

diff --git a/arch/x86/include/asm/refcount.h b/arch/x86/include/asm/refcount.h
deleted file mode 100644
index 232f856e0db0..000000000000
--- a/arch/x86/include/asm/refcount.h
+++ /dev/null
@@ -1,126 +0,0 @@
-#ifndef __ASM_X86_REFCOUNT_H
-#define __ASM_X86_REFCOUNT_H
-/*
- * x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from
- * PaX/grsecurity.
- */
-#include <linux/refcount.h>
-#include <asm/bug.h>
-
-/*
- * This is the first portion of the refcount error handling, which lives in
- * .text.unlikely, and is jumped to from the CPU flag check (in the
- * following macros). This saves the refcount value location into CX for
- * the exception handler to use (in mm/extable.c), and then triggers the
- * central refcount exception. The fixup address for the exception points
- * back to the regular execution flow in .text.
- */
-#define _REFCOUNT_EXCEPTION \
- ".pushsection .text..refcount\n" \
- "111:\tlea %[var], %%" _ASM_CX "\n" \
- "112:\t" ASM_UD2 "\n" \
- ASM_UNREACHABLE \
- ".popsection\n" \
- "113:\n" \
- _ASM_EXTABLE_REFCOUNT(112b, 113b)
-
-/* Trigger refcount exception if refcount result is negative. */
-#define REFCOUNT_CHECK_LT_ZERO \
- "js 111f\n\t" \
- _REFCOUNT_EXCEPTION
-
-/* Trigger refcount exception if refcount result is zero or negative. */
-#define REFCOUNT_CHECK_LE_ZERO \
- "jz 111f\n\t" \
- REFCOUNT_CHECK_LT_ZERO
-
-/* Trigger refcount exception unconditionally. */
-#define REFCOUNT_ERROR \
- "jmp 111f\n\t" \
- _REFCOUNT_EXCEPTION
-
-static __always_inline void refcount_add(unsigned int i, refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "addl %1,%0\n\t"
- REFCOUNT_CHECK_LT_ZERO
- : [var] "+m" (r->refs.counter)
- : "ir" (i)
- : "cc", "cx");
-}
-
-static __always_inline void refcount_inc(refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "incl %0\n\t"
- REFCOUNT_CHECK_LT_ZERO
- : [var] "+m" (r->refs.counter)
- : : "cc", "cx");
-}
-
-static __always_inline void refcount_dec(refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "decl %0\n\t"
- REFCOUNT_CHECK_LE_ZERO
- : [var] "+m" (r->refs.counter)
- : : "cc", "cx");
-}
-
-static __always_inline __must_check
-bool refcount_sub_and_test(unsigned int i, refcount_t *r)
-{
- bool ret = GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
- REFCOUNT_CHECK_LT_ZERO,
- r->refs.counter, e, "er", i, "cx");
-
- if (ret) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
-
- return false;
-}
-
-static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
-{
- bool ret = GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl",
- REFCOUNT_CHECK_LT_ZERO,
- r->refs.counter, e, "cx");
-
- if (ret) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
-
- return false;
-}
-
-static __always_inline __must_check
-bool refcount_add_not_zero(unsigned int i, refcount_t *r)
-{
- int c, result;
-
- c = atomic_read(&(r->refs));
- do {
- if (unlikely(c == 0))
- return false;
-
- result = c + i;
-
- /* Did we try to increment from/to an undesirable state? */
- if (unlikely(c < 0 || c == INT_MAX || result < c)) {
- asm volatile(REFCOUNT_ERROR
- : : [var] "m" (r->refs.counter)
- : "cc", "cx");
- break;
- }
-
- } while (!atomic_try_cmpxchg(&(r->refs), &c, result));
-
- return c != 0;
-}
-
-static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)
-{
- return refcount_add_not_zero(1, r);
-}
-
-#endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 4d75bc656f97..30bb0bd3b1b8 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -44,55 +44,6 @@ __visible bool ex_handler_fault(const struct exception_table_entry *fixup,
}
EXPORT_SYMBOL_GPL(ex_handler_fault);

-/*
- * Handler for UD0 exception following a failed test against the
- * result of a refcount inc/dec/add/sub.
- */
-__visible bool ex_handler_refcount(const struct exception_table_entry *fixup,
- struct pt_regs *regs, int trapnr,
- unsigned long error_code,
- unsigned long fault_addr)
-{
- /* First unconditionally saturate the refcount. */
- *(int *)regs->cx = INT_MIN / 2;
-
- /*
- * Strictly speaking, this reports the fixup destination, not
- * the fault location, and not the actually overflowing
- * instruction, which is the instruction before the "js", but
- * since that instruction could be a variety of lengths, just
- * report the location after the overflow, which should be close
- * enough for finding the overflow, as it's at least back in
- * the function, having returned from .text.unlikely.
- */
- regs->ip = ex_fixup_addr(fixup);
-
- /*
- * This function has been called because either a negative refcount
- * value was seen by any of the refcount functions, or a zero
- * refcount value was seen by refcount_dec().
- *
- * If we crossed from INT_MAX to INT_MIN, OF (Overflow Flag: result
- * wrapped around) will be set. Additionally, seeing the refcount
- * reach 0 will set ZF (Zero Flag: result was zero). In each of
- * these cases we want a report, since it's a boundary condition.
- * The SF case is not reported since it indicates post-boundary
- * manipulations below zero or above INT_MAX. And if none of the
- * flags are set, something has gone very wrong, so report it.
- */
- if (regs->flags & (X86_EFLAGS_OF | X86_EFLAGS_ZF)) {
- bool zero = regs->flags & X86_EFLAGS_ZF;
-
- refcount_error_report(regs, zero ? "hit zero" : "overflow");
- } else if ((regs->flags & X86_EFLAGS_SF) == 0) {
- /* Report if none of OF, ZF, nor SF are set. */
- refcount_error_report(regs, "unexpected saturation");
- }
-
- return true;
-}
-EXPORT_SYMBOL(ex_handler_refcount);
-
/*
* Handler for when we fail to restore a task's FPU state. We should never get
* here because the FPU state of a task using the FPU (task->thread.fpu.state)
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 00786a142ff0..1400fce39c58 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -22,7 +22,6 @@ config DRM_I915_DEBUG
depends on DRM_I915
select DEBUG_FS
select PREEMPT_COUNT
- select REFCOUNT_FULL
select I2C_CHARDEV
select STACKDEPOT
select DRM_DP_AUX_CHARDEV
diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 757d4630115c..0ac50cf62d06 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -1,64 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_REFCOUNT_H
-#define _LINUX_REFCOUNT_H
-
-#include <linux/atomic.h>
-#include <linux/compiler.h>
-#include <linux/limits.h>
-#include <linux/spinlock_types.h>
-
-struct mutex;
-
-/**
- * struct refcount_t - variant of atomic_t specialized for reference counts
- * @refs: atomic_t counter field
- *
- * The counter saturates at REFCOUNT_SATURATED and will not move once
- * there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free bugs.
- */
-typedef struct refcount_struct {
- atomic_t refs;
-} refcount_t;
-
-#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
-enum refcount_saturation_type {
- REFCOUNT_ADD_NOT_ZERO_OVF,
- REFCOUNT_ADD_OVF,
- REFCOUNT_ADD_UAF,
- REFCOUNT_SUB_UAF,
- REFCOUNT_DEC_LEAK,
-};
-
-void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
-
-/**
- * refcount_set - set a refcount's value
- * @r: the refcount
- * @n: value to which the refcount will be set
- */
-static inline void refcount_set(refcount_t *r, int n)
-{
- atomic_set(&r->refs, n);
-}
-
-/**
- * refcount_read - get a refcount's value
- * @r: the refcount
- *
- * Return: the refcount's value
- */
-static inline unsigned int refcount_read(const refcount_t *r)
-{
- return atomic_read(&r->refs);
-}
-
-#ifdef CONFIG_REFCOUNT_FULL
-#include <linux/bug.h>
-
/*
* Variant of atomic_t specialized for reference counts.
*
@@ -136,6 +76,64 @@ static inline unsigned int refcount_read(const refcount_t *r)
*
*/

+#ifndef _LINUX_REFCOUNT_H
+#define _LINUX_REFCOUNT_H
+
+#include <linux/atomic.h>
+#include <linux/bug.h>
+#include <linux/compiler.h>
+#include <linux/limits.h>
+#include <linux/spinlock_types.h>
+
+struct mutex;
+
+/**
+ * struct refcount_t - variant of atomic_t specialized for reference counts
+ * @refs: atomic_t counter field
+ *
+ * The counter saturates at REFCOUNT_SATURATED and will not move once
+ * there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free bugs.
+ */
+typedef struct refcount_struct {
+ atomic_t refs;
+} refcount_t;
+
+#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)
+
+enum refcount_saturation_type {
+ REFCOUNT_ADD_NOT_ZERO_OVF,
+ REFCOUNT_ADD_OVF,
+ REFCOUNT_ADD_UAF,
+ REFCOUNT_SUB_UAF,
+ REFCOUNT_DEC_LEAK,
+};
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
+
+/**
+ * refcount_set - set a refcount's value
+ * @r: the refcount
+ * @n: value to which the refcount will be set
+ */
+static inline void refcount_set(refcount_t *r, int n)
+{
+ atomic_set(&r->refs, n);
+}
+
+/**
+ * refcount_read - get a refcount's value
+ * @r: the refcount
+ *
+ * Return: the refcount's value
+ */
+static inline unsigned int refcount_read(const refcount_t *r)
+{
+ return atomic_read(&r->refs);
+}
+
/**
* refcount_add_not_zero - add a value to a refcount unless it is 0
* @i: the value to add to the refcount
@@ -298,46 +296,6 @@ static inline void refcount_dec(refcount_t *r)
if (unlikely(atomic_fetch_sub_release(1, &r->refs) <= 1))
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
-#else /* CONFIG_REFCOUNT_FULL */
-# ifdef CONFIG_ARCH_HAS_REFCOUNT
-# include <asm/refcount.h>
-# else
-static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
-{
- return atomic_add_unless(&r->refs, i, 0);
-}
-
-static inline void refcount_add(int i, refcount_t *r)
-{
- atomic_add(i, &r->refs);
-}
-
-static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
-{
- return atomic_add_unless(&r->refs, 1, 0);
-}
-
-static inline void refcount_inc(refcount_t *r)
-{
- atomic_inc(&r->refs);
-}
-
-static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
-{
- return atomic_sub_and_test(i, &r->refs);
-}
-
-static inline __must_check bool refcount_dec_and_test(refcount_t *r)
-{
- return atomic_dec_and_test(&r->refs);
-}
-
-static inline void refcount_dec(refcount_t *r)
-{
- atomic_dec(&r->refs);
-}
-# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
-#endif /* !CONFIG_REFCOUNT_FULL */

extern __must_check bool refcount_dec_if_one(refcount_t *r);
extern __must_check bool refcount_dec_not_one(refcount_t *r);
diff --git a/lib/refcount.c b/lib/refcount.c
index 8b7e249c0e10..ebac8b7d15a7 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Out-of-line refcount functions common to all refcount implementations.
+ * Out-of-line refcount functions.
*/

#include <linux/mutex.h>
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:02:05

by Will Deacon

Subject: [RESEND PATCH v4 05/10] lib/refcount: Improve performance of generic REFCOUNT_FULL code

Rewrite the generic REFCOUNT_FULL implementation so that the saturation
point is moved to INT_MIN / 2. This allows us to defer the sanity checks
until after the atomic operation, which removes many uses of cmpxchg()
in favour of atomic_fetch_{add,sub}().

Some crude perf results obtained from lkdtm show substantially less
overhead, despite the checking:

$ perf stat -r 3 -B -- echo {ATOMIC,REFCOUNT}_TIMING >/sys/kernel/debug/provoke-crash/DIRECT

# arm64
ATOMIC_TIMING: 46.50451 +- 0.00134 seconds time elapsed ( +- 0.00% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 77.57522 +- 0.00982 seconds time elapsed ( +- 0.01% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 48.7181 +- 0.0256 seconds time elapsed ( +- 0.05% )

# x86
ATOMIC_TIMING: 31.6225 +- 0.0776 seconds time elapsed ( +- 0.25% )
REFCOUNT_TIMING (!REFCOUNT_FULL, mainline/x86 asm): 31.6689 +- 0.0901 seconds time elapsed ( +- 0.28% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 53.203 +- 0.138 seconds time elapsed ( +- 0.26% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 31.7408 +- 0.0486 seconds time elapsed ( +- 0.15% )
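
The core trick, as a minimal userspace sketch using C11 atomics
(illustrative names; the real implementation is in the diff below, and
this assumes wrapping signed arithmetic, i.e. -fno-strict-overflow or
-fwrapv as used by the kernel): perform the unconditional atomic first,
then inspect the returned old value and, only on the bug path, race to
reset the counter to the saturation point. Placing that point at
INT_MIN / 2 keeps the counter deep inside the negative "bad" range even
if a few other operations slip in before the reset.

#include <stdatomic.h>
#include <limits.h>

#define CNT_SATURATED   (INT_MIN / 2)

static inline void cnt_inc(atomic_int *c)
{
        /* One fetch_add replaces the old cmpxchg() retry loop. */
        int old = atomic_fetch_add_explicit(c, 1, memory_order_relaxed);

        /*
         * old == 0 means an increment on a freed object; old < 0 or a
         * negative result means we overflowed past INT_MAX or were
         * already saturated. Repair the damage after the fact.
         */
        if (old <= 0 || old + 1 <= 0)
                atomic_store_explicit(c, CNT_SATURATED, memory_order_relaxed);
}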

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Tested-by: Jan Glauber <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 131 ++++++++++++++++++++++-----------------
1 file changed, 75 insertions(+), 56 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index e719b5b1220e..e3b218d669ce 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -47,8 +47,8 @@ static inline unsigned int refcount_read(const refcount_t *r)
#ifdef CONFIG_REFCOUNT_FULL
#include <linux/bug.h>

-#define REFCOUNT_MAX (UINT_MAX - 1)
-#define REFCOUNT_SATURATED UINT_MAX
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)

/*
* Variant of atomic_t specialized for reference counts.
@@ -56,9 +56,47 @@ static inline unsigned int refcount_read(const refcount_t *r)
* The interface matches the atomic_t interface (to aid in porting) but only
* provides the few functions one should use for reference counting.
*
- * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
- * move once there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free issues.
+ * Saturation semantics
+ * ====================
+ *
+ * refcount_t differs from atomic_t in that the counter saturates at
+ * REFCOUNT_SATURATED and will not move once there. This avoids wrapping the
+ * counter and causing 'spurious' use-after-free issues. In order to avoid the
+ * cost associated with introducing cmpxchg() loops into all of the saturating
+ * operations, we temporarily allow the counter to take on an unchecked value
+ * and then explicitly set it to REFCOUNT_SATURATED on detecting that underflow
+ * or overflow has occurred. Although this is racy when multiple threads
+ * access the refcount concurrently, by placing REFCOUNT_SATURATED roughly
+ * equidistant from 0 and INT_MAX we minimise the scope for error:
+ *
+ * INT_MAX REFCOUNT_SATURATED UINT_MAX
+ * 0 (0x7fff_ffff) (0xc000_0000) (0xffff_ffff)
+ * +--------------------------------+----------------+----------------+
+ * <---------- bad value! ---------->
+ *
+ * (in a signed view of the world, the "bad value" range corresponds to
+ * a negative counter value).
+ *
+ * As an example, consider a refcount_inc() operation that causes the counter
+ * to overflow:
+ *
+ * int old = atomic_fetch_add_relaxed(r);
+ * // old is INT_MAX, refcount now INT_MIN (0x8000_0000)
+ * if (old < 0)
+ * atomic_set(r, REFCOUNT_SATURATED);
+ *
+ * If another thread also performs a refcount_inc() operation between the two
+ * atomic operations, then the count will continue to edge closer to 0. If it
+ * reaches a value of 1 before /any/ of the threads reset it to the saturated
+ * value, then a concurrent refcount_dec_and_test() may erroneously free the
+ * underlying object. Given the precise timing details involved with the
+ * round-robin scheduling of each thread manipulating the refcount and the need
+ * to hit the race multiple times in succession, there doesn't appear to be a
+ * practical avenue of attack even if using refcount_add() operations with
+ * larger increments.
+ *
+ * Memory ordering
+ * ===============
*
* Memory ordering rules are slightly relaxed wrt regular atomic_t functions
* and provide only what is strictly required for refcounts.
@@ -109,25 +147,19 @@ static inline unsigned int refcount_read(const refcount_t *r)
*/
static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
+ int old = refcount_read(r);

do {
- if (!val)
- return false;
-
- if (unlikely(val == REFCOUNT_SATURATED))
- return true;
-
- new = val + i;
- if (new < val)
- new = REFCOUNT_SATURATED;
+ if (!old)
+ break;
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));

- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
+ if (unlikely(old < 0 || old + i < 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: saturated; leaking memory.\n");
+ }

- return true;
+ return old;
}

/**
@@ -148,7 +180,13 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
*/
static inline void refcount_add(int i, refcount_t *r)
{
- WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
+ int old = atomic_fetch_add_relaxed(i, &r->refs);
+
+ WARN_ONCE(!old, "refcount_t: addition on 0; use-after-free.\n");
+ if (unlikely(old <= 0 || old + i <= 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(old, "refcount_t: saturated; leaking memory.\n");
+ }
}

/**
@@ -166,23 +204,7 @@ static inline void refcount_add(int i, refcount_t *r)
*/
static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- new = val + 1;
-
- if (!val)
- return false;
-
- if (unlikely(!new))
- return true;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
+ return refcount_add_not_zero(1, r);
}

/**
@@ -199,7 +221,7 @@ static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
*/
static inline void refcount_inc(refcount_t *r)
{
- WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+ refcount_add(1, r);
}

/**
@@ -224,26 +246,19 @@ static inline void refcount_inc(refcount_t *r)
*/
static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (unlikely(val == REFCOUNT_SATURATED))
- return false;
+ int old = atomic_fetch_sub_release(i, &r->refs);

- new = val - i;
- if (new > val) {
- WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
- return false;
- }
-
- } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
-
- if (!new) {
+ if (old == i) {
smp_acquire__after_ctrl_dep();
return true;
}
- return false;

+ if (unlikely(old < 0 || old - i < 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: underflow; use-after-free.\n");
+ }
+
+ return false;
}

/**
@@ -276,9 +291,13 @@ static inline __must_check bool refcount_dec_and_test(refcount_t *r)
*/
static inline void refcount_dec(refcount_t *r)
{
- WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
-}
+ int old = atomic_fetch_sub_release(1, &r->refs);

+ if (unlikely(old <= 1)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: decrement hit 0; leaking memory.\n");
+ }
+}
#else /* CONFIG_REFCOUNT_FULL */

#define REFCOUNT_MAX INT_MAX
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:02:10

by Will Deacon

Subject: [RESEND PATCH v4 09/10] lib/refcount: Remove unused 'refcount_error_report()' function

'refcount_error_report()' has no callers. Remove it.

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/kernel.h | 7 -------
kernel/panic.c | 11 -----------
2 files changed, 18 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d83d403dac2e..09f759228e3f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -328,13 +328,6 @@ extern int oops_may_print(void);
void do_exit(long error_code) __noreturn;
void complete_and_exit(struct completion *, long) __noreturn;

-#ifdef CONFIG_ARCH_HAS_REFCOUNT
-void refcount_error_report(struct pt_regs *regs, const char *err);
-#else
-static inline void refcount_error_report(struct pt_regs *regs, const char *err)
-{ }
-#endif
-
/* Internal, do not use. */
int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
int __must_check _kstrtol(const char *s, unsigned int base, long *res);
diff --git a/kernel/panic.c b/kernel/panic.c
index 47e8ebccc22b..10d05fd4f9c3 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -670,17 +670,6 @@ EXPORT_SYMBOL(__stack_chk_fail);

#endif

-#ifdef CONFIG_ARCH_HAS_REFCOUNT
-void refcount_error_report(struct pt_regs *regs, const char *err)
-{
- WARN_RATELIMIT(1, "refcount_t %s at %pB in %s[%d], uid/euid: %u/%u\n",
- err, (void *)instruction_pointer(regs),
- current->comm, task_pid_nr(current),
- from_kuid_munged(&init_user_ns, current_uid()),
- from_kuid_munged(&init_user_ns, current_euid()));
-}
-#endif
-
core_param(panic, panic_timeout, int, 0644);
core_param(panic_print, panic_print, ulong, 0644);
core_param(pause_on_oops, pause_on_oops, int, 0644);
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:02:30

by Will Deacon

Subject: [RESEND PATCH v4 07/10] lib/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions

The definitions of REFCOUNT_MAX and REFCOUNT_SATURATED are the same,
regardless of CONFIG_REFCOUNT_FULL, so consolidate them into a single
pair of definitions.

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 1cd0a876a789..757d4630115c 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -22,6 +22,8 @@ typedef struct refcount_struct {
} refcount_t;

#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)

enum refcount_saturation_type {
REFCOUNT_ADD_NOT_ZERO_OVF,
@@ -57,9 +59,6 @@ static inline unsigned int refcount_read(const refcount_t *r)
#ifdef CONFIG_REFCOUNT_FULL
#include <linux/bug.h>

-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
/*
* Variant of atomic_t specialized for reference counts.
*
@@ -300,10 +299,6 @@ static inline void refcount_dec(refcount_t *r)
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
#else /* CONFIG_REFCOUNT_FULL */
-
-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
# ifdef CONFIG_ARCH_HAS_REFCOUNT
# include <asm/refcount.h>
# else
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:02:35

by Will Deacon

Subject: [RESEND PATCH v4 10/10] drivers/lkdtm: Remove references to CONFIG_REFCOUNT_FULL

CONFIG_REFCOUNT_FULL no longer exists, so remove all references to it.

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Kees Cook <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
drivers/misc/lkdtm/refcount.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/misc/lkdtm/refcount.c b/drivers/misc/lkdtm/refcount.c
index abf3b7c1f686..de7c5ab528d9 100644
--- a/drivers/misc/lkdtm/refcount.c
+++ b/drivers/misc/lkdtm/refcount.c
@@ -119,7 +119,7 @@ void lkdtm_REFCOUNT_DEC_ZERO(void)
static void check_negative(refcount_t *ref, int start)
{
/*
- * CONFIG_REFCOUNT_FULL refuses to move a refcount at all on an
+ * refcount_t refuses to move a refcount at all on an
* over-sub, so we have to track our starting position instead of
* looking only at zero-pinning.
*/
@@ -202,7 +202,6 @@ static void check_from_zero(refcount_t *ref)

/*
* A refcount_inc() from zero should pin to zero or saturate and may WARN.
- * Only CONFIG_REFCOUNT_FULL provides this protection currently.
*/
void lkdtm_REFCOUNT_INC_ZERO(void)
{
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:03:11

by Will Deacon

Subject: [RESEND PATCH v4 03/10] lib/refcount: Remove unused refcount_*_checked() variants

The full-fat refcount implementation is exposed via a set of functions
suffixed with "_checked()", the idea being that code can choose to use
the more expensive, yet more secure implementation on a case-by-case
basis.

In reality, this hasn't happened, so with a grand total of zero users,
let's remove the checked variants for now by simply dropping the suffix
and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.

Cc: Ingo Molnar <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
include/linux/refcount.h | 25 ++++++-------------
lib/refcount.c | 54 +++++++++++++++++++++-------------------
2 files changed, 36 insertions(+), 43 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 89066a1471dd..edd505d1a23b 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -44,32 +44,21 @@ static inline unsigned int refcount_read(const refcount_t *r)
return atomic_read(&r->refs);
}

-extern __must_check bool refcount_add_not_zero_checked(int i, refcount_t *r);
-extern void refcount_add_checked(int i, refcount_t *r);
-
-extern __must_check bool refcount_inc_not_zero_checked(refcount_t *r);
-extern void refcount_inc_checked(refcount_t *r);
-
-extern __must_check bool refcount_sub_and_test_checked(int i, refcount_t *r);
-
-extern __must_check bool refcount_dec_and_test_checked(refcount_t *r);
-extern void refcount_dec_checked(refcount_t *r);
-
#ifdef CONFIG_REFCOUNT_FULL

#define REFCOUNT_MAX (UINT_MAX - 1)
#define REFCOUNT_SATURATED UINT_MAX

-#define refcount_add_not_zero refcount_add_not_zero_checked
-#define refcount_add refcount_add_checked
+extern __must_check bool refcount_add_not_zero(int i, refcount_t *r);
+extern void refcount_add(int i, refcount_t *r);

-#define refcount_inc_not_zero refcount_inc_not_zero_checked
-#define refcount_inc refcount_inc_checked
+extern __must_check bool refcount_inc_not_zero(refcount_t *r);
+extern void refcount_inc(refcount_t *r);

-#define refcount_sub_and_test refcount_sub_and_test_checked
+extern __must_check bool refcount_sub_and_test(int i, refcount_t *r);

-#define refcount_dec_and_test refcount_dec_and_test_checked
-#define refcount_dec refcount_dec_checked
+extern __must_check bool refcount_dec_and_test(refcount_t *r);
+extern void refcount_dec(refcount_t *r);

#else

diff --git a/lib/refcount.c b/lib/refcount.c
index 719b0bc42ab1..a2f670998cee 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -43,8 +43,10 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

+#ifdef CONFIG_REFCOUNT_FULL
+
/**
- * refcount_add_not_zero_checked - add a value to a refcount unless it is 0
+ * refcount_add_not_zero - add a value to a refcount unless it is 0
* @i: the value to add to the refcount
* @r: the refcount
*
@@ -61,7 +63,7 @@
*
* Return: false if the passed refcount is 0, true otherwise
*/
-bool refcount_add_not_zero_checked(int i, refcount_t *r)
+bool refcount_add_not_zero(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -83,10 +85,10 @@ bool refcount_add_not_zero_checked(int i, refcount_t *r)

return true;
}
-EXPORT_SYMBOL(refcount_add_not_zero_checked);
+EXPORT_SYMBOL(refcount_add_not_zero);

/**
- * refcount_add_checked - add a value to a refcount
+ * refcount_add - add a value to a refcount
* @i: the value to add to the refcount
* @r: the refcount
*
@@ -101,14 +103,14 @@ EXPORT_SYMBOL(refcount_add_not_zero_checked);
* cases, refcount_inc(), or one of its variants, should instead be used to
* increment a reference count.
*/
-void refcount_add_checked(int i, refcount_t *r)
+void refcount_add(int i, refcount_t *r)
{
- WARN_ONCE(!refcount_add_not_zero_checked(i, r), "refcount_t: addition on 0; use-after-free.\n");
+ WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
}
-EXPORT_SYMBOL(refcount_add_checked);
+EXPORT_SYMBOL(refcount_add);

/**
- * refcount_inc_not_zero_checked - increment a refcount unless it is 0
+ * refcount_inc_not_zero - increment a refcount unless it is 0
* @r: the refcount to increment
*
* Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
@@ -120,7 +122,7 @@ EXPORT_SYMBOL(refcount_add_checked);
*
* Return: true if the increment was successful, false otherwise
*/
-bool refcount_inc_not_zero_checked(refcount_t *r)
+bool refcount_inc_not_zero(refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -140,10 +142,10 @@ bool refcount_inc_not_zero_checked(refcount_t *r)

return true;
}
-EXPORT_SYMBOL(refcount_inc_not_zero_checked);
+EXPORT_SYMBOL(refcount_inc_not_zero);

/**
- * refcount_inc_checked - increment a refcount
+ * refcount_inc - increment a refcount
* @r: the refcount to increment
*
* Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
@@ -154,14 +156,14 @@ EXPORT_SYMBOL(refcount_inc_not_zero_checked);
* Will WARN if the refcount is 0, as this represents a possible use-after-free
* condition.
*/
-void refcount_inc_checked(refcount_t *r)
+void refcount_inc(refcount_t *r)
{
- WARN_ONCE(!refcount_inc_not_zero_checked(r), "refcount_t: increment on 0; use-after-free.\n");
+ WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
}
-EXPORT_SYMBOL(refcount_inc_checked);
+EXPORT_SYMBOL(refcount_inc);

/**
- * refcount_sub_and_test_checked - subtract from a refcount and test if it is 0
+ * refcount_sub_and_test - subtract from a refcount and test if it is 0
* @i: amount to subtract from the refcount
* @r: the refcount
*
@@ -180,7 +182,7 @@ EXPORT_SYMBOL(refcount_inc_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_sub_and_test_checked(int i, refcount_t *r)
+bool refcount_sub_and_test(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -203,10 +205,10 @@ bool refcount_sub_and_test_checked(int i, refcount_t *r)
return false;

}
-EXPORT_SYMBOL(refcount_sub_and_test_checked);
+EXPORT_SYMBOL(refcount_sub_and_test);

/**
- * refcount_dec_and_test_checked - decrement a refcount and test if it is 0
+ * refcount_dec_and_test - decrement a refcount and test if it is 0
* @r: the refcount
*
* Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
@@ -218,14 +220,14 @@ EXPORT_SYMBOL(refcount_sub_and_test_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_dec_and_test_checked(refcount_t *r)
+bool refcount_dec_and_test(refcount_t *r)
{
- return refcount_sub_and_test_checked(1, r);
+ return refcount_sub_and_test(1, r);
}
-EXPORT_SYMBOL(refcount_dec_and_test_checked);
+EXPORT_SYMBOL(refcount_dec_and_test);

/**
- * refcount_dec_checked - decrement a refcount
+ * refcount_dec - decrement a refcount
* @r: the refcount
*
* Similar to atomic_dec(), it will WARN on underflow and fail to decrement
@@ -234,11 +236,13 @@ EXPORT_SYMBOL(refcount_dec_and_test_checked);
* Provides release memory ordering, such that prior loads and stores are done
* before.
*/
-void refcount_dec_checked(refcount_t *r)
+void refcount_dec(refcount_t *r)
{
- WARN_ONCE(refcount_dec_and_test_checked(r), "refcount_t: decrement hit 0; leaking memory.\n");
+ WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
}
-EXPORT_SYMBOL(refcount_dec_checked);
+EXPORT_SYMBOL(refcount_dec);
+
+#endif /* CONFIG_REFCOUNT_FULL */

/**
* refcount_dec_if_one - decrement a refcount if it is 1
--
2.24.0.432.g9d3f5f5b63-goog

2019-11-21 12:09:39

by Ard Biesheuvel

Subject: Re: [RESEND PATCH v4 00/10] Rework REFCOUNT_FULL using atomic_fetch_* operations

On Thu, 21 Nov 2019 at 12:59, Will Deacon <[email protected]> wrote:
>
> Hi everybody,
>
> This is a resend of version four of the patches I posted here:
>
> v4: https://lore.kernel.org/lkml/[email protected]
>
> Previous versions can be found at:
>
> v1: https://lkml.kernel.org/r/[email protected]
> v2: https://lkml.kernel.org/r/[email protected]
> v3: https://lkml.kernel.org/r/[email protected]
>
> I didn't receive any feedback last time around, other than some positive
> noises from Kees, so please consider this for inclusion in mainline.
>

For the series,

Reviewed-by: Ard Biesheuvel <[email protected]>


> Cc: Kees Cook <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Elena Reshetova <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Hanjun Guo <[email protected]>
>
> --->8
>
> Will Deacon (10):
> lib/refcount: Define constants for saturation and max refcount values
> lib/refcount: Ensure integer operands are treated as signed
> lib/refcount: Remove unused refcount_*_checked() variants
> lib/refcount: Move bulk of REFCOUNT_FULL implementation into header
> lib/refcount: Improve performance of generic REFCOUNT_FULL code
> lib/refcount: Move saturation warnings out of line
> lib/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
> refcount: Consolidate implementations of refcount_t
> lib/refcount: Remove unused 'refcount_error_report()' function
> drivers/lkdtm: Remove references to CONFIG_REFCOUNT_FULL
>
> arch/Kconfig | 21 ---
> arch/arm/Kconfig | 1 -
> arch/arm64/Kconfig | 1 -
> arch/s390/configs/debug_defconfig | 1 -
> arch/x86/Kconfig | 1 -
> arch/x86/include/asm/asm.h | 6 -
> arch/x86/include/asm/refcount.h | 126 --------------
> arch/x86/mm/extable.c | 49 ------
> drivers/gpu/drm/i915/Kconfig.debug | 1 -
> drivers/misc/lkdtm/refcount.c | 11 +-
> include/linux/kernel.h | 7 -
> include/linux/refcount.h | 269 ++++++++++++++++++++++++-----
> kernel/panic.c | 11 --
> lib/refcount.c | 255 +++------------------------
> 14 files changed, 257 insertions(+), 503 deletions(-)
> delete mode 100644 arch/x86/include/asm/refcount.h
>
> --
> 2.24.0.432.g9d3f5f5b63-goog
>

2019-11-21 14:59:11

by David Sterba

Subject: Re: [RESEND PATCH v4 03/10] lib/refcount: Remove unused refcount_*_checked() variants

On Thu, Nov 21, 2019 at 11:58:55AM +0000, Will Deacon wrote:
> The full-fat refcount implementation is exposed via a set of functions
> suffixed with "_checked()", the idea being that code can choose to use
> the more expensive, yet more secure implementation on a case-by-case
> basis.
>
> In reality, this hasn't happened, so with a grand total of zero users,
> let's remove the checked variants for now by simply dropping the suffix
> and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.

I am still interested in the _checked versions and have a WIP patch that
adds them to btrfs (that was my original plan), but I haven't had enough
time to finalize it. The patch itself is simple; the missing part is to
understand and document what the saturated counters would do to the
structures.

If the _checked helpers really bother you, then we'll remove them, but I
think it's a good API extension that makes the full-refcount semantics
independent of the config.

2019-11-21 17:15:41

by Kees Cook

Subject: Re: [RESEND PATCH v4 03/10] lib/refcount: Remove unused refcount_*_checked() variants

On Thu, Nov 21, 2019 at 03:55:33PM +0100, David Sterba wrote:
> On Thu, Nov 21, 2019 at 11:58:55AM +0000, Will Deacon wrote:
> > The full-fat refcount implementation is exposed via a set of functions
> > suffixed with "_checked()", the idea being that code can choose to use
> > the more expensive, yet more secure implementation on a case-by-case
> > basis.
> >
> > In reality, this hasn't happened, so with a grand total of zero users,
> > let's remove the checked variants for now by simply dropping the suffix
> > and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.
>
> I am still interested in the _checked versions and have a WIP patch that
> adds that to btrfs (that was my original plan) but haven't had enough
> time to finalize it. The patch itself is simple, the missing part is to
> understand and document what the saturated counters would do with the
> structures.

The good news is that this series removes the case of refcount_t _not_
being checked, so there's no need for _checked helpers.
CONFIG_REFCOUNT_FULL gets removed because all refcount_t ends up being
checked on all architectures. No extra work needed! :) (See patch 8)
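
As a rough userspace model of what a saturated counter then does to the
structures David mentions -- illustrative only, not the kernel
implementation: once pinned at INT_MIN / 2 the count can never get back
to zero, so the final put never frees and the object is leaked by design:

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define REFCOUNT_SATURATED (INT_MIN / 2)

/* Model of refcount_dec_and_test() after saturation: the value stays
 * deep in the negative "bad" range, never reaches zero, and so the
 * release path (kfree() etc.) is never taken. */
static bool dec_and_test(int *refs)
{
        if (--(*refs) == 0)
                return true;
        if (*refs < 0)
                *refs = REFCOUNT_SATURATED;     /* re-pin the counter */
        return false;
}

int main(void)
{
        int refs = REFCOUNT_SATURATED;

        for (int i = 0; i < 1000; i++)
                if (dec_and_test(&refs))
                        printf("would free - never reached\n");

        printf("final count: %d (object leaked)\n", refs);
        return 0;
}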

--
Kees Cook

2019-11-21 17:20:21

by David Sterba

Subject: Re: [RESEND PATCH v4 03/10] lib/refcount: Remove unused refcount_*_checked() variants

On Thu, Nov 21, 2019 at 09:11:58AM -0800, Kees Cook wrote:
> On Thu, Nov 21, 2019 at 03:55:33PM +0100, David Sterba wrote:
> > On Thu, Nov 21, 2019 at 11:58:55AM +0000, Will Deacon wrote:
> > > The full-fat refcount implementation is exposed via a set of functions
> > > suffixed with "_checked()", the idea being that code can choose to use
> > > the more expensive, yet more secure implementation on a case-by-case
> > > basis.
> > >
> > > In reality, this hasn't happened, so with a grand total of zero users,
> > > let's remove the checked variants for now by simply dropping the suffix
> > > and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.
> >
> > I am still interested in the _checked versions and have a WIP patch that
> > adds that to btrfs (that was my original plan) but haven't had enough
> > time to finalize it. The patch itself is simple, the missing part is to
> > understand and document what the saturated counters would do with the
> > structures.
>
> The good news is that this series removes the case of refcount_t _not_
> being checked, so there's no need for _checked helpers.
> CONFIG_REFCOUNT_FULL gets removed because all refcount_t ends up being
> checked on all architectures. No extra work needed! :) (See patch 8)

Oh I see, that's great. Thanks.

2019-11-22 03:47:44

by Hanjun Guo

Subject: Re: [RESEND PATCH v4 00/10] Rework REFCOUNT_FULL using atomic_fetch_* operations

On 2019/11/21 19:58, Will Deacon wrote:
> Hi everybody,
>
> This is a resend of version four of the patches I posted here:
>
> v4: https://lore.kernel.org/lkml/[email protected]
>
> Previous versions can be found at:
>
> v1: https://lkml.kernel.org/r/[email protected]
> v2: https://lkml.kernel.org/r/[email protected]
> v3: https://lkml.kernel.org/r/[email protected]
>
> I didn't receive any feedback last time around, other than some positive
> noises from Kees, so please consider this for inclusion in mainline.

This time I tested this patch set on both ARM64 and x86 servers, against
5.4-rc8.

On ARM64, I ran a similar test case and saw a similar performance
improvement.

On x86, I tested on a 24-core Xeon(R) CPU E5-2620, with a kernel compile
(make -j) as the test case, and no issues were found.

Tested-by: Hanjun Guo <[email protected]>

Thanks
Hanjun

2019-11-25 08:20:52

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] locking/refcount: Remove unused 'refcount_error_report()' function

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 2f30b36943adca070f2e1551f701bd524ed1ae5a
Gitweb: https://git.kernel.org/tip/2f30b36943adca070f2e1551f701bd524ed1ae5a
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:59:01
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:42 +01:00

locking/refcount: Remove unused 'refcount_error_report()' function

'refcount_error_report()' has no callers. Remove it.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Acked-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/kernel.h | 7 -------
kernel/panic.c | 11 -----------
2 files changed, 18 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d83d403..09f7592 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -328,13 +328,6 @@ extern int oops_may_print(void);
void do_exit(long error_code) __noreturn;
void complete_and_exit(struct completion *, long) __noreturn;

-#ifdef CONFIG_ARCH_HAS_REFCOUNT
-void refcount_error_report(struct pt_regs *regs, const char *err);
-#else
-static inline void refcount_error_report(struct pt_regs *regs, const char *err)
-{ }
-#endif
-
/* Internal, do not use. */
int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
int __must_check _kstrtol(const char *s, unsigned int base, long *res);
diff --git a/kernel/panic.c b/kernel/panic.c
index f470a03..b69ee9e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -671,17 +671,6 @@ EXPORT_SYMBOL(__stack_chk_fail);

#endif

-#ifdef CONFIG_ARCH_HAS_REFCOUNT
-void refcount_error_report(struct pt_regs *regs, const char *err)
-{
- WARN_RATELIMIT(1, "refcount_t %s at %pB in %s[%d], uid/euid: %u/%u\n",
- err, (void *)instruction_pointer(regs),
- current->comm, task_pid_nr(current),
- from_kuid_munged(&init_user_ns, current_uid()),
- from_kuid_munged(&init_user_ns, current_euid()));
-}
-#endif
-
core_param(panic, panic_timeout, int, 0644);
core_param(panic_print, panic_print, ulong, 0644);
core_param(pause_on_oops, pause_on_oops, int, 0644);

2019-11-25 08:21:08

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] locking/refcount: Improve performance of generic REFCOUNT_FULL code

The following commit has been merged into the locking/core branch of tip:

Commit-ID: dcb786493f3e48da3272b710028d42ec608cfda1
Gitweb: https://git.kernel.org/tip/dcb786493f3e48da3272b710028d42ec608cfda1
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:57
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:10 +01:00

locking/refcount: Improve performance of generic REFCOUNT_FULL code

Rewrite the generic REFCOUNT_FULL implementation so that the saturation
point is moved to INT_MIN / 2. This allows us to defer the sanity checks
until after the atomic operation, which removes many uses of cmpxchg()
in favour of atomic_fetch_{add,sub}().
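
To make the shape of the change concrete, here is a standalone C11
model of the two schemes (illustrative only -- the SATURATED macro,
function names and userspace framing are not the kernel code; build
with -fwrapv, mirroring the kernel's -fno-strict-overflow, so signed
wrap is well-defined):

#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define SATURATED (INT_MIN / 2)

/* Old scheme: validate before committing, via a cmpxchg() loop. */
static bool inc_cmpxchg(atomic_int *r)
{
        int val = atomic_load_explicit(r, memory_order_relaxed);
        int new;

        do {
                if (!val)
                        return false;   /* increment-from-zero refused */
                new = (val == INT_MAX) ? SATURATED : val + 1;
        } while (!atomic_compare_exchange_weak_explicit(r, &val, new,
                        memory_order_relaxed, memory_order_relaxed));

        return true;
}

/* New scheme: commit first with fetch_add(), detect and repair
 * afterwards. The common (non-overflow) path needs no loop at all. */
static void inc_fetch(atomic_int *r)
{
        int old = atomic_fetch_add_explicit(r, 1, memory_order_relaxed);

        if (!old)
                fprintf(stderr, "increment on 0; use-after-free\n");
        if (old <= 0 || old + 1 <= 0) {
                /* From zero, or wrapped past INT_MAX: pin the counter. */
                atomic_store_explicit(r, SATURATED, memory_order_relaxed);
        }
}

int main(void)
{
        atomic_int r = INT_MAX;

        inc_fetch(&r);                          /* overflows, then pins */
        printf("count = %d\n", atomic_load(&r));

        atomic_store(&r, 1);
        printf("ok = %d\n", inc_cmpxchg(&r));   /* 1; count is now 2 */
        return 0;
}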

Some crude perf results obtained from lkdtm show substantially less
overhead, despite the checking:

$ perf stat -r 3 -B -- echo {ATOMIC,REFCOUNT}_TIMING >/sys/kernel/debug/provoke-crash/DIRECT

# arm64
ATOMIC_TIMING: 46.50451 +- 0.00134 seconds time elapsed ( +- 0.00% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 77.57522 +- 0.00982 seconds time elapsed ( +- 0.01% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 48.7181 +- 0.0256 seconds time elapsed ( +- 0.05% )

# x86
ATOMIC_TIMING: 31.6225 +- 0.0776 seconds time elapsed ( +- 0.25% )
REFCOUNT_TIMING (!REFCOUNT_FULL, mainline/x86 asm): 31.6689 +- 0.0901 seconds time elapsed ( +- 0.28% )
REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 53.203 +- 0.138 seconds time elapsed ( +- 0.26% )
REFCOUNT_TIMING (REFCOUNT_FULL, this series): 31.7408 +- 0.0486 seconds time elapsed ( +- 0.15% )

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Tested-by: Jan Glauber <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 131 +++++++++++++++++++++-----------------
1 file changed, 75 insertions(+), 56 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index e719b5b..e3b218d 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -47,8 +47,8 @@ static inline unsigned int refcount_read(const refcount_t *r)
#ifdef CONFIG_REFCOUNT_FULL
#include <linux/bug.h>

-#define REFCOUNT_MAX (UINT_MAX - 1)
-#define REFCOUNT_SATURATED UINT_MAX
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)

/*
* Variant of atomic_t specialized for reference counts.
@@ -56,9 +56,47 @@ static inline unsigned int refcount_read(const refcount_t *r)
* The interface matches the atomic_t interface (to aid in porting) but only
* provides the few functions one should use for reference counting.
*
- * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
- * move once there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free issues.
+ * Saturation semantics
+ * ====================
+ *
+ * refcount_t differs from atomic_t in that the counter saturates at
+ * REFCOUNT_SATURATED and will not move once there. This avoids wrapping the
+ * counter and causing 'spurious' use-after-free issues. In order to avoid the
+ * cost associated with introducing cmpxchg() loops into all of the saturating
+ * operations, we temporarily allow the counter to take on an unchecked value
+ * and then explicitly set it to REFCOUNT_SATURATED on detecting that underflow
+ * or overflow has occurred. Although this is racy when multiple threads
+ * access the refcount concurrently, by placing REFCOUNT_SATURATED roughly
+ * equidistant from 0 and INT_MAX we minimise the scope for error:
+ *
+ * 0                            INT_MAX    REFCOUNT_SATURATED    UINT_MAX
+ *                           (0x7fff_ffff)    (0xc000_0000)    (0xffff_ffff)
+ * +--------------------------------+----------------+----------------+
+ *                                  <---------- bad value! ---------->
+ *
+ * (in a signed view of the world, the "bad value" range corresponds to
+ * a negative counter value).
+ *
+ * As an example, consider a refcount_inc() operation that causes the counter
+ * to overflow:
+ *
+ * int old = atomic_fetch_add_relaxed(r);
+ * // old is INT_MAX, refcount now INT_MIN (0x8000_0000)
+ * if (old < 0)
+ * atomic_set(r, REFCOUNT_SATURATED);
+ *
+ * If another thread also performs a refcount_inc() operation between the two
+ * atomic operations, then the count will continue to edge closer to 0. If it
+ * reaches a value of 1 before /any/ of the threads reset it to the saturated
+ * value, then a concurrent refcount_dec_and_test() may erroneously free the
+ * underlying object. Given the precise timing details involved with the
+ * round-robin scheduling of each thread manipulating the refcount and the need
+ * to hit the race multiple times in succession, there doesn't appear to be a
+ * practical avenue of attack even if using refcount_add() operations with
+ * larger increments.
+ *
+ * Memory ordering
+ * ===============
*
* Memory ordering rules are slightly relaxed wrt regular atomic_t functions
* and provide only what is strictly required for refcounts.
@@ -109,25 +147,19 @@ static inline unsigned int refcount_read(const refcount_t *r)
*/
static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
+ int old = refcount_read(r);

do {
- if (!val)
- return false;
-
- if (unlikely(val == REFCOUNT_SATURATED))
- return true;
-
- new = val + i;
- if (new < val)
- new = REFCOUNT_SATURATED;
+ if (!old)
+ break;
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));

- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
+ if (unlikely(old < 0 || old + i < 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: saturated; leaking memory.\n");
+ }

- return true;
+ return old;
}

/**
@@ -148,7 +180,13 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
*/
static inline void refcount_add(int i, refcount_t *r)
{
- WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
+ int old = atomic_fetch_add_relaxed(i, &r->refs);
+
+ WARN_ONCE(!old, "refcount_t: addition on 0; use-after-free.\n");
+ if (unlikely(old <= 0 || old + i <= 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(old, "refcount_t: saturated; leaking memory.\n");
+ }
}

/**
@@ -166,23 +204,7 @@ static inline void refcount_add(int i, refcount_t *r)
*/
static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- new = val + 1;
-
- if (!val)
- return false;
-
- if (unlikely(!new))
- return true;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
+ return refcount_add_not_zero(1, r);
}

/**
@@ -199,7 +221,7 @@ static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
*/
static inline void refcount_inc(refcount_t *r)
{
- WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+ refcount_add(1, r);
}

/**
@@ -224,26 +246,19 @@ static inline void refcount_inc(refcount_t *r)
*/
static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (unlikely(val == REFCOUNT_SATURATED))
- return false;
+ int old = atomic_fetch_sub_release(i, &r->refs);

- new = val - i;
- if (new > val) {
- WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
- return false;
- }
-
- } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
-
- if (!new) {
+ if (old == i) {
smp_acquire__after_ctrl_dep();
return true;
}
- return false;

+ if (unlikely(old < 0 || old - i < 0)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: underflow; use-after-free.\n");
+ }
+
+ return false;
}

/**
@@ -276,9 +291,13 @@ static inline __must_check bool refcount_dec_and_test(refcount_t *r)
*/
static inline void refcount_dec(refcount_t *r)
{
- WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
-}
+ int old = atomic_fetch_sub_release(1, &r->refs);

+ if (unlikely(old <= 1)) {
+ refcount_set(r, REFCOUNT_SATURATED);
+ WARN_ONCE(1, "refcount_t: decrement hit 0; leaking memory.\n");
+ }
+}
#else /* CONFIG_REFCOUNT_FULL */

#define REFCOUNT_MAX INT_MAX

2019-11-25 08:21:09

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] locking/refcount: Ensure integer operands are treated as signed

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 97a1420adf0cdf0cf6f41bab0b2acf658c96b94b
Gitweb: https://git.kernel.org/tip/97a1420adf0cdf0cf6f41bab0b2acf658c96b94b
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:54
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:14:57 +01:00

locking/refcount: Ensure integer operands are treated as signed

In preparation for changing the saturation point of REFCOUNT_FULL to
INT_MIN/2, change the type of integer operands passed into the API
from 'unsigned int' to 'int' so that we can avoid casting during
comparisons when we don't want to fall foul of C integral conversion
rules for signed and unsigned types.

Since the kernel is compiled with '-fno-strict-overflow', we don't need
to worry about the UB introduced by signed overflow here. Furthermore,
we're already making heavy use of the atomic_t API, which operates
exclusively on signed types.
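
The conversion trap being sidestepped, in miniature (a standalone
snippet for illustration, not from the patch):

#include <stdio.h>

int main(void)
{
        unsigned int count = 10;
        int delta = -1;

        /*
         * The usual arithmetic conversions turn 'delta' into a huge
         * unsigned value (0xffffffff) before the comparison, so the
         * intuitive branch can never be taken -- exactly the surprise
         * that keeping every operand signed avoids.
         */
        if (delta < count)
                printf("delta < count\n");      /* never printed */
        else
                printf("delta >= count\n");     /* -1 compares as 4294967295 */

        return 0;
}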

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 14 +++++++-------
lib/refcount.c | 6 +++---
2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 79f62e8..89066a1 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -28,7 +28,7 @@ typedef struct refcount_struct {
* @r: the refcount
* @n: value to which the refcount will be set
*/
-static inline void refcount_set(refcount_t *r, unsigned int n)
+static inline void refcount_set(refcount_t *r, int n)
{
atomic_set(&r->refs, n);
}
@@ -44,13 +44,13 @@ static inline unsigned int refcount_read(const refcount_t *r)
return atomic_read(&r->refs);
}

-extern __must_check bool refcount_add_not_zero_checked(unsigned int i, refcount_t *r);
-extern void refcount_add_checked(unsigned int i, refcount_t *r);
+extern __must_check bool refcount_add_not_zero_checked(int i, refcount_t *r);
+extern void refcount_add_checked(int i, refcount_t *r);

extern __must_check bool refcount_inc_not_zero_checked(refcount_t *r);
extern void refcount_inc_checked(refcount_t *r);

-extern __must_check bool refcount_sub_and_test_checked(unsigned int i, refcount_t *r);
+extern __must_check bool refcount_sub_and_test_checked(int i, refcount_t *r);

extern __must_check bool refcount_dec_and_test_checked(refcount_t *r);
extern void refcount_dec_checked(refcount_t *r);
@@ -79,12 +79,12 @@ extern void refcount_dec_checked(refcount_t *r);
# ifdef CONFIG_ARCH_HAS_REFCOUNT
# include <asm/refcount.h>
# else
-static inline __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
+static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
{
return atomic_add_unless(&r->refs, i, 0);
}

-static inline void refcount_add(unsigned int i, refcount_t *r)
+static inline void refcount_add(int i, refcount_t *r)
{
atomic_add(i, &r->refs);
}
@@ -99,7 +99,7 @@ static inline void refcount_inc(refcount_t *r)
atomic_inc(&r->refs);
}

-static inline __must_check bool refcount_sub_and_test(unsigned int i, refcount_t *r)
+static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
{
return atomic_sub_and_test(i, &r->refs);
}
diff --git a/lib/refcount.c b/lib/refcount.c
index 48b78a4..719b0bc 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -61,7 +61,7 @@
*
* Return: false if the passed refcount is 0, true otherwise
*/
-bool refcount_add_not_zero_checked(unsigned int i, refcount_t *r)
+bool refcount_add_not_zero_checked(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -101,7 +101,7 @@ EXPORT_SYMBOL(refcount_add_not_zero_checked);
* cases, refcount_inc(), or one of its variants, should instead be used to
* increment a reference count.
*/
-void refcount_add_checked(unsigned int i, refcount_t *r)
+void refcount_add_checked(int i, refcount_t *r)
{
WARN_ONCE(!refcount_add_not_zero_checked(i, r), "refcount_t: addition on 0; use-after-free.\n");
}
@@ -180,7 +180,7 @@ EXPORT_SYMBOL(refcount_inc_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_sub_and_test_checked(unsigned int i, refcount_t *r)
+bool refcount_sub_and_test_checked(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

2019-11-25 08:21:13

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] lkdtm: Remove references to CONFIG_REFCOUNT_FULL

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 500543c53a54134ced386aed85cd93cf1363f981
Gitweb: https://git.kernel.org/tip/500543c53a54134ced386aed85cd93cf1363f981
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:59:02
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:46 +01:00

lkdtm: Remove references to CONFIG_REFCOUNT_FULL

CONFIG_REFCOUNT_FULL no longer exists, so remove all references to it.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Acked-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
drivers/misc/lkdtm/refcount.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/misc/lkdtm/refcount.c b/drivers/misc/lkdtm/refcount.c
index abf3b7c..de7c5ab 100644
--- a/drivers/misc/lkdtm/refcount.c
+++ b/drivers/misc/lkdtm/refcount.c
@@ -119,7 +119,7 @@ void lkdtm_REFCOUNT_DEC_ZERO(void)
static void check_negative(refcount_t *ref, int start)
{
/*
- * CONFIG_REFCOUNT_FULL refuses to move a refcount at all on an
+ * refcount_t refuses to move a refcount at all on an
* over-sub, so we have to track our starting position instead of
* looking only at zero-pinning.
*/
@@ -202,7 +202,6 @@ static void check_from_zero(refcount_t *ref)

/*
* A refcount_inc() from zero should pin to zero or saturate and may WARN.
- * Only CONFIG_REFCOUNT_FULL provides this protection currently.
*/
void lkdtm_REFCOUNT_INC_ZERO(void)
{

2019-11-25 08:21:59

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] locking/refcount: Remove unused refcount_*_checked() variants

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 7221762c48c6bbbcc6cc51d8b803c06930215e34
Gitweb: https://git.kernel.org/tip/7221762c48c6bbbcc6cc51d8b803c06930215e34
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:55
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:03 +01:00

locking/refcount: Remove unused refcount_*_checked() variants

The full-fat refcount implementation is exposed via a set of functions
suffixed with "_checked()", the idea being that code can choose to use
the more expensive, yet more secure implementation on a case-by-case
basis.

In reality, this hasn't happened, so with a grand total of zero users,
let's remove the checked variants for now by simply dropping the suffix
and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 25 +++++-------------
lib/refcount.c | 54 ++++++++++++++++++++-------------------
2 files changed, 36 insertions(+), 43 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 89066a1..edd505d 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -44,32 +44,21 @@ static inline unsigned int refcount_read(const refcount_t *r)
return atomic_read(&r->refs);
}

-extern __must_check bool refcount_add_not_zero_checked(int i, refcount_t *r);
-extern void refcount_add_checked(int i, refcount_t *r);
-
-extern __must_check bool refcount_inc_not_zero_checked(refcount_t *r);
-extern void refcount_inc_checked(refcount_t *r);
-
-extern __must_check bool refcount_sub_and_test_checked(int i, refcount_t *r);
-
-extern __must_check bool refcount_dec_and_test_checked(refcount_t *r);
-extern void refcount_dec_checked(refcount_t *r);
-
#ifdef CONFIG_REFCOUNT_FULL

#define REFCOUNT_MAX (UINT_MAX - 1)
#define REFCOUNT_SATURATED UINT_MAX

-#define refcount_add_not_zero refcount_add_not_zero_checked
-#define refcount_add refcount_add_checked
+extern __must_check bool refcount_add_not_zero(int i, refcount_t *r);
+extern void refcount_add(int i, refcount_t *r);

-#define refcount_inc_not_zero refcount_inc_not_zero_checked
-#define refcount_inc refcount_inc_checked
+extern __must_check bool refcount_inc_not_zero(refcount_t *r);
+extern void refcount_inc(refcount_t *r);

-#define refcount_sub_and_test refcount_sub_and_test_checked
+extern __must_check bool refcount_sub_and_test(int i, refcount_t *r);

-#define refcount_dec_and_test refcount_dec_and_test_checked
-#define refcount_dec refcount_dec_checked
+extern __must_check bool refcount_dec_and_test(refcount_t *r);
+extern void refcount_dec(refcount_t *r);

#else

diff --git a/lib/refcount.c b/lib/refcount.c
index 719b0bc..a2f6709 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -43,8 +43,10 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

+#ifdef CONFIG_REFCOUNT_FULL
+
/**
- * refcount_add_not_zero_checked - add a value to a refcount unless it is 0
+ * refcount_add_not_zero - add a value to a refcount unless it is 0
* @i: the value to add to the refcount
* @r: the refcount
*
@@ -61,7 +63,7 @@
*
* Return: false if the passed refcount is 0, true otherwise
*/
-bool refcount_add_not_zero_checked(int i, refcount_t *r)
+bool refcount_add_not_zero(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -83,10 +85,10 @@ bool refcount_add_not_zero_checked(int i, refcount_t *r)

return true;
}
-EXPORT_SYMBOL(refcount_add_not_zero_checked);
+EXPORT_SYMBOL(refcount_add_not_zero);

/**
- * refcount_add_checked - add a value to a refcount
+ * refcount_add - add a value to a refcount
* @i: the value to add to the refcount
* @r: the refcount
*
@@ -101,14 +103,14 @@ EXPORT_SYMBOL(refcount_add_not_zero_checked);
* cases, refcount_inc(), or one of its variants, should instead be used to
* increment a reference count.
*/
-void refcount_add_checked(int i, refcount_t *r)
+void refcount_add(int i, refcount_t *r)
{
- WARN_ONCE(!refcount_add_not_zero_checked(i, r), "refcount_t: addition on 0; use-after-free.\n");
+ WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
}
-EXPORT_SYMBOL(refcount_add_checked);
+EXPORT_SYMBOL(refcount_add);

/**
- * refcount_inc_not_zero_checked - increment a refcount unless it is 0
+ * refcount_inc_not_zero - increment a refcount unless it is 0
* @r: the refcount to increment
*
* Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
@@ -120,7 +122,7 @@ EXPORT_SYMBOL(refcount_add_checked);
*
* Return: true if the increment was successful, false otherwise
*/
-bool refcount_inc_not_zero_checked(refcount_t *r)
+bool refcount_inc_not_zero(refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -140,10 +142,10 @@ bool refcount_inc_not_zero_checked(refcount_t *r)

return true;
}
-EXPORT_SYMBOL(refcount_inc_not_zero_checked);
+EXPORT_SYMBOL(refcount_inc_not_zero);

/**
- * refcount_inc_checked - increment a refcount
+ * refcount_inc - increment a refcount
* @r: the refcount to increment
*
* Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
@@ -154,14 +156,14 @@ EXPORT_SYMBOL(refcount_inc_not_zero_checked);
* Will WARN if the refcount is 0, as this represents a possible use-after-free
* condition.
*/
-void refcount_inc_checked(refcount_t *r)
+void refcount_inc(refcount_t *r)
{
- WARN_ONCE(!refcount_inc_not_zero_checked(r), "refcount_t: increment on 0; use-after-free.\n");
+ WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
}
-EXPORT_SYMBOL(refcount_inc_checked);
+EXPORT_SYMBOL(refcount_inc);

/**
- * refcount_sub_and_test_checked - subtract from a refcount and test if it is 0
+ * refcount_sub_and_test - subtract from a refcount and test if it is 0
* @i: amount to subtract from the refcount
* @r: the refcount
*
@@ -180,7 +182,7 @@ EXPORT_SYMBOL(refcount_inc_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_sub_and_test_checked(int i, refcount_t *r)
+bool refcount_sub_and_test(int i, refcount_t *r)
{
unsigned int new, val = atomic_read(&r->refs);

@@ -203,10 +205,10 @@ bool refcount_sub_and_test_checked(int i, refcount_t *r)
return false;

}
-EXPORT_SYMBOL(refcount_sub_and_test_checked);
+EXPORT_SYMBOL(refcount_sub_and_test);

/**
- * refcount_dec_and_test_checked - decrement a refcount and test if it is 0
+ * refcount_dec_and_test - decrement a refcount and test if it is 0
* @r: the refcount
*
* Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
@@ -218,14 +220,14 @@ EXPORT_SYMBOL(refcount_sub_and_test_checked);
*
* Return: true if the resulting refcount is 0, false otherwise
*/
-bool refcount_dec_and_test_checked(refcount_t *r)
+bool refcount_dec_and_test(refcount_t *r)
{
- return refcount_sub_and_test_checked(1, r);
+ return refcount_sub_and_test(1, r);
}
-EXPORT_SYMBOL(refcount_dec_and_test_checked);
+EXPORT_SYMBOL(refcount_dec_and_test);

/**
- * refcount_dec_checked - decrement a refcount
+ * refcount_dec - decrement a refcount
* @r: the refcount
*
* Similar to atomic_dec(), it will WARN on underflow and fail to decrement
@@ -234,11 +236,13 @@ EXPORT_SYMBOL(refcount_dec_and_test_checked);
* Provides release memory ordering, such that prior loads and stores are done
* before.
*/
-void refcount_dec_checked(refcount_t *r)
+void refcount_dec(refcount_t *r)
{
- WARN_ONCE(refcount_dec_and_test_checked(r), "refcount_t: decrement hit 0; leaking memory.\n");
+ WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
}
-EXPORT_SYMBOL(refcount_dec_checked);
+EXPORT_SYMBOL(refcount_dec);
+
+#endif /* CONFIG_REFCOUNT_FULL */

/**
* refcount_dec_if_one - decrement a refcount if it is 1

2019-11-25 08:22:20

by tip-bot2 for Jacob Pan

Subject: [tip: locking/core] locking/refcount: Consolidate implementations of refcount_t

The following commit has been merged into the locking/core branch of tip:

Commit-ID: fb041bb7c0a918b95c6889fc965cdc4a75b4c0ca
Gitweb: https://git.kernel.org/tip/fb041bb7c0a918b95c6889fc965cdc4a75b4c0ca
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:59:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:32 +01:00

locking/refcount: Consolidate implementations of refcount_t

The generic implementation of refcount_t should be good enough for
everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely,
leaving the generic implementation enabled unconditionally.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Acked-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/Kconfig | 21 +----
arch/arm/Kconfig | 1 +-
arch/arm64/Kconfig | 1 +-
arch/s390/configs/debug_defconfig | 1 +-
arch/x86/Kconfig | 1 +-
arch/x86/include/asm/asm.h | 6 +-
arch/x86/include/asm/refcount.h | 126 +----------------------
arch/x86/mm/extable.c | 49 +---------
drivers/gpu/drm/i915/Kconfig.debug | 1 +-
include/linux/refcount.h | 158 ++++++++++------------------
lib/refcount.c | 2 +-
11 files changed, 59 insertions(+), 308 deletions(-)
delete mode 100644 arch/x86/include/asm/refcount.h

diff --git a/arch/Kconfig b/arch/Kconfig
index 5f8a5d8..8bcc1c7 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -892,27 +892,6 @@ config STRICT_MODULE_RWX
config ARCH_HAS_PHYS_TO_DMA
bool

-config ARCH_HAS_REFCOUNT
- bool
- help
- An architecture selects this when it has implemented refcount_t
- using open coded assembly primitives that provide an optimized
- refcount_t implementation, possibly at the expense of some full
- refcount state checks of CONFIG_REFCOUNT_FULL=y.
-
- The refcount overflow check behavior, however, must be retained.
- Catching overflows is the primary security concern for protecting
- against bugs in reference counts.
-
-config REFCOUNT_FULL
- bool "Perform full reference count validation at the expense of speed"
- help
- Enabling this switches the refcounting infrastructure from a fast
- unchecked atomic_t implementation to a fully state checked
- implementation, which can be (slightly) slower but provides protections
- against various use-after-free conditions that can be used in
- security flaw exploits.
-
config HAVE_ARCH_COMPILER_H
bool
help
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 8a50efb..0d3c5d7 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -117,7 +117,6 @@ config ARM
select OLD_SIGSUSPEND3
select PCI_SYSCALL if PCI
select PERF_USE_VMALLOC
- select REFCOUNT_FULL
select RTC_LIB
select SYS_SUPPORTS_APM_EMULATION
# Above selects are sorted alphabetically; please add new ones
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 41a9b42..bc990d3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -181,7 +181,6 @@ config ARM64
select PCI_SYSCALL if PCI
select POWER_RESET
select POWER_SUPPLY
- select REFCOUNT_FULL
select SPARSE_IRQ
select SWIOTLB
select SYSCTL_EXCEPTION_TRACE
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 38d6403..2e60c80 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -62,7 +62,6 @@ CONFIG_OPROFILE=m
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_STATIC_KEYS_SELFTEST=y
-CONFIG_REFCOUNT_FULL=y
CONFIG_LOCK_EVENT_COUNTS=y
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d6e1faa..fa6274f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -73,7 +73,6 @@ config X86
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_PTE_DEVMAP if X86_64
select ARCH_HAS_PTE_SPECIAL
- select ARCH_HAS_REFCOUNT
select ARCH_HAS_UACCESS_FLUSHCACHE if X86_64
select ARCH_HAS_UACCESS_MCSAFE if X86_64 && X86_MCE
select ARCH_HAS_SET_MEMORY
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 3ff577c..5a0c14e 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -139,9 +139,6 @@
# define _ASM_EXTABLE_EX(from, to) \
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)

-# define _ASM_EXTABLE_REFCOUNT(from, to) \
- _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
-
# define _ASM_NOKPROBE(entry) \
.pushsection "_kprobe_blacklist","aw" ; \
_ASM_ALIGN ; \
@@ -170,9 +167,6 @@
# define _ASM_EXTABLE_EX(from, to) \
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)

-# define _ASM_EXTABLE_REFCOUNT(from, to) \
- _ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
-
/* For C file, we already have NOKPROBE_SYMBOL macro */
#endif

diff --git a/arch/x86/include/asm/refcount.h b/arch/x86/include/asm/refcount.h
deleted file mode 100644
index 232f856..0000000
--- a/arch/x86/include/asm/refcount.h
+++ /dev/null
@@ -1,126 +0,0 @@
-#ifndef __ASM_X86_REFCOUNT_H
-#define __ASM_X86_REFCOUNT_H
-/*
- * x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from
- * PaX/grsecurity.
- */
-#include <linux/refcount.h>
-#include <asm/bug.h>
-
-/*
- * This is the first portion of the refcount error handling, which lives in
- * .text.unlikely, and is jumped to from the CPU flag check (in the
- * following macros). This saves the refcount value location into CX for
- * the exception handler to use (in mm/extable.c), and then triggers the
- * central refcount exception. The fixup address for the exception points
- * back to the regular execution flow in .text.
- */
-#define _REFCOUNT_EXCEPTION \
- ".pushsection .text..refcount\n" \
- "111:\tlea %[var], %%" _ASM_CX "\n" \
- "112:\t" ASM_UD2 "\n" \
- ASM_UNREACHABLE \
- ".popsection\n" \
- "113:\n" \
- _ASM_EXTABLE_REFCOUNT(112b, 113b)
-
-/* Trigger refcount exception if refcount result is negative. */
-#define REFCOUNT_CHECK_LT_ZERO \
- "js 111f\n\t" \
- _REFCOUNT_EXCEPTION
-
-/* Trigger refcount exception if refcount result is zero or negative. */
-#define REFCOUNT_CHECK_LE_ZERO \
- "jz 111f\n\t" \
- REFCOUNT_CHECK_LT_ZERO
-
-/* Trigger refcount exception unconditionally. */
-#define REFCOUNT_ERROR \
- "jmp 111f\n\t" \
- _REFCOUNT_EXCEPTION
-
-static __always_inline void refcount_add(unsigned int i, refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "addl %1,%0\n\t"
- REFCOUNT_CHECK_LT_ZERO
- : [var] "+m" (r->refs.counter)
- : "ir" (i)
- : "cc", "cx");
-}
-
-static __always_inline void refcount_inc(refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "incl %0\n\t"
- REFCOUNT_CHECK_LT_ZERO
- : [var] "+m" (r->refs.counter)
- : : "cc", "cx");
-}
-
-static __always_inline void refcount_dec(refcount_t *r)
-{
- asm volatile(LOCK_PREFIX "decl %0\n\t"
- REFCOUNT_CHECK_LE_ZERO
- : [var] "+m" (r->refs.counter)
- : : "cc", "cx");
-}
-
-static __always_inline __must_check
-bool refcount_sub_and_test(unsigned int i, refcount_t *r)
-{
- bool ret = GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
- REFCOUNT_CHECK_LT_ZERO,
- r->refs.counter, e, "er", i, "cx");
-
- if (ret) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
-
- return false;
-}
-
-static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
-{
- bool ret = GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl",
- REFCOUNT_CHECK_LT_ZERO,
- r->refs.counter, e, "cx");
-
- if (ret) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
-
- return false;
-}
-
-static __always_inline __must_check
-bool refcount_add_not_zero(unsigned int i, refcount_t *r)
-{
- int c, result;
-
- c = atomic_read(&(r->refs));
- do {
- if (unlikely(c == 0))
- return false;
-
- result = c + i;
-
- /* Did we try to increment from/to an undesirable state? */
- if (unlikely(c < 0 || c == INT_MAX || result < c)) {
- asm volatile(REFCOUNT_ERROR
- : : [var] "m" (r->refs.counter)
- : "cc", "cx");
- break;
- }
-
- } while (!atomic_try_cmpxchg(&(r->refs), &c, result));
-
- return c != 0;
-}
-
-static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)
-{
- return refcount_add_not_zero(1, r);
-}
-
-#endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 4d75bc6..30bb0bd 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -45,55 +45,6 @@ __visible bool ex_handler_fault(const struct exception_table_entry *fixup,
EXPORT_SYMBOL_GPL(ex_handler_fault);

/*
- * Handler for UD0 exception following a failed test against the
- * result of a refcount inc/dec/add/sub.
- */
-__visible bool ex_handler_refcount(const struct exception_table_entry *fixup,
- struct pt_regs *regs, int trapnr,
- unsigned long error_code,
- unsigned long fault_addr)
-{
- /* First unconditionally saturate the refcount. */
- *(int *)regs->cx = INT_MIN / 2;
-
- /*
- * Strictly speaking, this reports the fixup destination, not
- * the fault location, and not the actually overflowing
- * instruction, which is the instruction before the "js", but
- * since that instruction could be a variety of lengths, just
- * report the location after the overflow, which should be close
- * enough for finding the overflow, as it's at least back in
- * the function, having returned from .text.unlikely.
- */
- regs->ip = ex_fixup_addr(fixup);
-
- /*
- * This function has been called because either a negative refcount
- * value was seen by any of the refcount functions, or a zero
- * refcount value was seen by refcount_dec().
- *
- * If we crossed from INT_MAX to INT_MIN, OF (Overflow Flag: result
- * wrapped around) will be set. Additionally, seeing the refcount
- * reach 0 will set ZF (Zero Flag: result was zero). In each of
- * these cases we want a report, since it's a boundary condition.
- * The SF case is not reported since it indicates post-boundary
- * manipulations below zero or above INT_MAX. And if none of the
- * flags are set, something has gone very wrong, so report it.
- */
- if (regs->flags & (X86_EFLAGS_OF | X86_EFLAGS_ZF)) {
- bool zero = regs->flags & X86_EFLAGS_ZF;
-
- refcount_error_report(regs, zero ? "hit zero" : "overflow");
- } else if ((regs->flags & X86_EFLAGS_SF) == 0) {
- /* Report if none of OF, ZF, nor SF are set. */
- refcount_error_report(regs, "unexpected saturation");
- }
-
- return true;
-}
-EXPORT_SYMBOL(ex_handler_refcount);
-
-/*
* Handler for when we fail to restore a task's FPU state. We should never get
* here because the FPU state of a task using the FPU (task->thread.fpu.state)
* should always be valid. However, past bugs have allowed userspace to set
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 00786a1..1400fce 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -22,7 +22,6 @@ config DRM_I915_DEBUG
depends on DRM_I915
select DEBUG_FS
select PREEMPT_COUNT
- select REFCOUNT_FULL
select I2C_CHARDEV
select STACKDEPOT
select DRM_DP_AUX_CHARDEV
diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 757d463..0ac50cf 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -1,64 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_REFCOUNT_H
-#define _LINUX_REFCOUNT_H
-
-#include <linux/atomic.h>
-#include <linux/compiler.h>
-#include <linux/limits.h>
-#include <linux/spinlock_types.h>
-
-struct mutex;
-
-/**
- * struct refcount_t - variant of atomic_t specialized for reference counts
- * @refs: atomic_t counter field
- *
- * The counter saturates at REFCOUNT_SATURATED and will not move once
- * there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free bugs.
- */
-typedef struct refcount_struct {
- atomic_t refs;
-} refcount_t;
-
-#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
-enum refcount_saturation_type {
- REFCOUNT_ADD_NOT_ZERO_OVF,
- REFCOUNT_ADD_OVF,
- REFCOUNT_ADD_UAF,
- REFCOUNT_SUB_UAF,
- REFCOUNT_DEC_LEAK,
-};
-
-void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
-
-/**
- * refcount_set - set a refcount's value
- * @r: the refcount
- * @n: value to which the refcount will be set
- */
-static inline void refcount_set(refcount_t *r, int n)
-{
- atomic_set(&r->refs, n);
-}
-
-/**
- * refcount_read - get a refcount's value
- * @r: the refcount
- *
- * Return: the refcount's value
- */
-static inline unsigned int refcount_read(const refcount_t *r)
-{
- return atomic_read(&r->refs);
-}
-
-#ifdef CONFIG_REFCOUNT_FULL
-#include <linux/bug.h>
-
/*
* Variant of atomic_t specialized for reference counts.
*
@@ -136,6 +76,64 @@ static inline unsigned int refcount_read(const refcount_t *r)
*
*/

+#ifndef _LINUX_REFCOUNT_H
+#define _LINUX_REFCOUNT_H
+
+#include <linux/atomic.h>
+#include <linux/bug.h>
+#include <linux/compiler.h>
+#include <linux/limits.h>
+#include <linux/spinlock_types.h>
+
+struct mutex;
+
+/**
+ * struct refcount_t - variant of atomic_t specialized for reference counts
+ * @refs: atomic_t counter field
+ *
+ * The counter saturates at REFCOUNT_SATURATED and will not move once
+ * there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free bugs.
+ */
+typedef struct refcount_struct {
+ atomic_t refs;
+} refcount_t;
+
+#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)
+
+enum refcount_saturation_type {
+ REFCOUNT_ADD_NOT_ZERO_OVF,
+ REFCOUNT_ADD_OVF,
+ REFCOUNT_ADD_UAF,
+ REFCOUNT_SUB_UAF,
+ REFCOUNT_DEC_LEAK,
+};
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
+
+/**
+ * refcount_set - set a refcount's value
+ * @r: the refcount
+ * @n: value to which the refcount will be set
+ */
+static inline void refcount_set(refcount_t *r, int n)
+{
+ atomic_set(&r->refs, n);
+}
+
+/**
+ * refcount_read - get a refcount's value
+ * @r: the refcount
+ *
+ * Return: the refcount's value
+ */
+static inline unsigned int refcount_read(const refcount_t *r)
+{
+ return atomic_read(&r->refs);
+}
+
/**
* refcount_add_not_zero - add a value to a refcount unless it is 0
* @i: the value to add to the refcount
@@ -298,46 +296,6 @@ static inline void refcount_dec(refcount_t *r)
if (unlikely(atomic_fetch_sub_release(1, &r->refs) <= 1))
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
-#else /* CONFIG_REFCOUNT_FULL */
-# ifdef CONFIG_ARCH_HAS_REFCOUNT
-# include <asm/refcount.h>
-# else
-static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
-{
- return atomic_add_unless(&r->refs, i, 0);
-}
-
-static inline void refcount_add(int i, refcount_t *r)
-{
- atomic_add(i, &r->refs);
-}
-
-static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
-{
- return atomic_add_unless(&r->refs, 1, 0);
-}
-
-static inline void refcount_inc(refcount_t *r)
-{
- atomic_inc(&r->refs);
-}
-
-static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
-{
- return atomic_sub_and_test(i, &r->refs);
-}
-
-static inline __must_check bool refcount_dec_and_test(refcount_t *r)
-{
- return atomic_dec_and_test(&r->refs);
-}
-
-static inline void refcount_dec(refcount_t *r)
-{
- atomic_dec(&r->refs);
-}
-# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
-#endif /* !CONFIG_REFCOUNT_FULL */

extern __must_check bool refcount_dec_if_one(refcount_t *r);
extern __must_check bool refcount_dec_not_one(refcount_t *r);
diff --git a/lib/refcount.c b/lib/refcount.c
index 8b7e249..ebac8b7 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Out-of-line refcount functions common to all refcount implementations.
+ * Out-of-line refcount functions.
*/

#include <linux/mutex.h>
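
With the implementations consolidated, every caller sees the same checked
API regardless of architecture. A minimal usage sketch (the 'struct foo'
object and the foo_get()/foo_put() helpers are hypothetical illustrations,
not taken from the patch):

#include <linux/refcount.h>
#include <linux/slab.h>

struct foo {
	refcount_t ref;
	/* ... payload ... */
};

static struct foo *foo_get(struct foo *f)
{
	refcount_inc(&f->ref);		/* WARNs on 0 (use-after-free) */
	return f;
}

static void foo_put(struct foo *f)
{
	/* release ordering, plus acquire on the 1 -> 0 transition */
	if (refcount_dec_and_test(&f->ref))
		kfree(f);
}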

2019-11-25 08:22:51

by tip-bot2 for Will Deacon

[permalink] [raw]
Subject: [tip: locking/core] locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 65b008552469f1c37f5e06e0016924502e40b4f5
Gitweb: https://git.kernel.org/tip/65b008552469f1c37f5e06e0016924502e40b4f5
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:59
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:27 +01:00

locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions

The definitions of REFCOUNT_MAX and REFCOUNT_SATURATED are the same,
regardless of CONFIG_REFCOUNT_FULL, so consolidate them into a single
pair of definitions.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 1cd0a87..757d463 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -22,6 +22,8 @@ typedef struct refcount_struct {
} refcount_t;

#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }
+#define REFCOUNT_MAX INT_MAX
+#define REFCOUNT_SATURATED (INT_MIN / 2)

enum refcount_saturation_type {
REFCOUNT_ADD_NOT_ZERO_OVF,
@@ -57,9 +59,6 @@ static inline unsigned int refcount_read(const refcount_t *r)
#ifdef CONFIG_REFCOUNT_FULL
#include <linux/bug.h>

-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
/*
* Variant of atomic_t specialized for reference counts.
*
@@ -300,10 +299,6 @@ static inline void refcount_dec(refcount_t *r)
refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
#else /* CONFIG_REFCOUNT_FULL */
-
-#define REFCOUNT_MAX INT_MAX
-#define REFCOUNT_SATURATED (INT_MIN / 2)
-
# ifdef CONFIG_ARCH_HAS_REFCOUNT
# include <asm/refcount.h>
# else
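
With the duplicates removed, the single pair of definitions gives
REFCOUNT_MAX = INT_MAX = 2147483647 and REFCOUNT_SATURATED = INT_MIN / 2
= -1073741824, i.e. 0xc0000000 when viewed as unsigned. A quick
user-space check of those values (a sketch, not kernel code):

#include <limits.h>
#include <stdio.h>

int main(void)
{
	printf("REFCOUNT_MAX       = %d\n", INT_MAX);
	printf("REFCOUNT_SATURATED = %d (0x%x)\n",
	       INT_MIN / 2, (unsigned int)(INT_MIN / 2));
	return 0;
}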

2019-11-25 08:23:06

by tip-bot2 for Will Deacon

[permalink] [raw]
Subject: [tip: locking/core] locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 77e9971c79c29542ab7dd4140f9343bf2ff36158
Gitweb: https://git.kernel.org/tip/77e9971c79c29542ab7dd4140f9343bf2ff36158
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:56
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:06 +01:00

locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header

In an effort to improve performance of the REFCOUNT_FULL implementation,
move the bulk of its functions into linux/refcount.h. This allows them
to be inlined in the same way as if they had been provided via
CONFIG_ARCH_HAS_REFCOUNT.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 237 ++++++++++++++++++++++++++++++++++++--
lib/refcount.c | 238 +--------------------------------------
2 files changed, 229 insertions(+), 246 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index edd505d..e719b5b 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -45,22 +45,241 @@ static inline unsigned int refcount_read(const refcount_t *r)
}

#ifdef CONFIG_REFCOUNT_FULL
+#include <linux/bug.h>

#define REFCOUNT_MAX (UINT_MAX - 1)
#define REFCOUNT_SATURATED UINT_MAX

-extern __must_check bool refcount_add_not_zero(int i, refcount_t *r);
-extern void refcount_add(int i, refcount_t *r);
+/*
+ * Variant of atomic_t specialized for reference counts.
+ *
+ * The interface matches the atomic_t interface (to aid in porting) but only
+ * provides the few functions one should use for reference counting.
+ *
+ * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
+ * move once there. This avoids wrapping the counter and causing 'spurious'
+ * use-after-free issues.
+ *
+ * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
+ * and provide only what is strictly required for refcounts.
+ *
+ * The increments are fully relaxed; these will not provide ordering. The
+ * rationale is that whatever is used to obtain the object we're increasing the
+ * reference count on will provide the ordering. For locked data structures,
+ * its the lock acquire, for RCU/lockless data structures its the dependent
+ * load.
+ *
+ * Do note that inc_not_zero() provides a control dependency which will order
+ * future stores against the inc, this ensures we'll never modify the object
+ * if we did not in fact acquire a reference.
+ *
+ * The decrements will provide release order, such that all the prior loads and
+ * stores will be issued before, it also provides a control dependency, which
+ * will order us against the subsequent free().
+ *
+ * The control dependency is against the load of the cmpxchg (ll/sc) that
+ * succeeded. This means the stores aren't fully ordered, but this is fine
+ * because the 1->0 transition indicates no concurrency.
+ *
+ * Note that the allocator is responsible for ordering things between free()
+ * and alloc().
+ *
+ * The decrements dec_and_test() and sub_and_test() also provide acquire
+ * ordering on success.
+ *
+ */
+
+/**
+ * refcount_add_not_zero - add a value to a refcount unless it is 0
+ * @i: the value to add to the refcount
+ * @r: the refcount
+ *
+ * Will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_inc(), or one of its variants, should instead be used to
+ * increment a reference count.
+ *
+ * Return: false if the passed refcount is 0, true otherwise
+ */
+static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ if (!val)
+ return false;
+
+ if (unlikely(val == REFCOUNT_SATURATED))
+ return true;
+
+ new = val + i;
+ if (new < val)
+ new = REFCOUNT_SATURATED;
+
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
+
+ WARN_ONCE(new == REFCOUNT_SATURATED,
+ "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/**
+ * refcount_add - add a value to a refcount
+ * @i: the value to add to the refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_add(), but will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_inc(), or one of its variants, should instead be used to
+ * increment a reference count.
+ */
+static inline void refcount_add(int i, refcount_t *r)
+{
+ WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
+}
+
+/**
+ * refcount_inc_not_zero - increment a refcount unless it is 0
+ * @r: the refcount to increment
+ *
+ * Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
+ * and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller has guaranteed the
+ * object memory to be stable (RCU, etc.). It does provide a control dependency
+ * and thereby orders future stores. See the comment on top.
+ *
+ * Return: true if the increment was successful, false otherwise
+ */
+static inline __must_check bool refcount_inc_not_zero(refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ new = val + 1;

-extern __must_check bool refcount_inc_not_zero(refcount_t *r);
-extern void refcount_inc(refcount_t *r);
+ if (!val)
+ return false;

-extern __must_check bool refcount_sub_and_test(int i, refcount_t *r);
+ if (unlikely(!new))
+ return true;

-extern __must_check bool refcount_dec_and_test(refcount_t *r);
-extern void refcount_dec(refcount_t *r);
+ } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
+
+ WARN_ONCE(new == REFCOUNT_SATURATED,
+ "refcount_t: saturated; leaking memory.\n");
+
+ return true;
+}
+
+/**
+ * refcount_inc - increment a refcount
+ * @r: the refcount to increment
+ *
+ * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
+ *
+ * Provides no memory ordering, it is assumed the caller already has a
+ * reference on the object.
+ *
+ * Will WARN if the refcount is 0, as this represents a possible use-after-free
+ * condition.
+ */
+static inline void refcount_inc(refcount_t *r)
+{
+ WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
+}
+
+/**
+ * refcount_sub_and_test - subtract from a refcount and test if it is 0
+ * @i: amount to subtract from the refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_dec_and_test(), but it will WARN, return false and
+ * ultimately leak on underflow and will fail to decrement when saturated
+ * at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides an acquire ordering on success such that free()
+ * must come after.
+ *
+ * Use of this function is not recommended for the normal reference counting
+ * use case in which references are taken and released one at a time. In these
+ * cases, refcount_dec(), or one of its variants, should instead be used to
+ * decrement a reference count.
+ *
+ * Return: true if the resulting refcount is 0, false otherwise
+ */
+static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
+{
+ unsigned int new, val = atomic_read(&r->refs);
+
+ do {
+ if (unlikely(val == REFCOUNT_SATURATED))
+ return false;
+
+ new = val - i;
+ if (new > val) {
+ WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
+ return false;
+ }
+
+ } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
+
+ if (!new) {
+ smp_acquire__after_ctrl_dep();
+ return true;
+ }
+ return false;
+
+}
+
+/**
+ * refcount_dec_and_test - decrement a refcount and test if it is 0
+ * @r: the refcount
+ *
+ * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
+ * decrement when saturated at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before, and provides an acquire ordering on success such that free()
+ * must come after.
+ *
+ * Return: true if the resulting refcount is 0, false otherwise
+ */
+static inline __must_check bool refcount_dec_and_test(refcount_t *r)
+{
+ return refcount_sub_and_test(1, r);
+}
+
+/**
+ * refcount_dec - decrement a refcount
+ * @r: the refcount
+ *
+ * Similar to atomic_dec(), it will WARN on underflow and fail to decrement
+ * when saturated at REFCOUNT_SATURATED.
+ *
+ * Provides release memory ordering, such that prior loads and stores are done
+ * before.
+ */
+static inline void refcount_dec(refcount_t *r)
+{
+ WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
+}

-#else
+#else /* CONFIG_REFCOUNT_FULL */

#define REFCOUNT_MAX INT_MAX
#define REFCOUNT_SATURATED (INT_MIN / 2)
@@ -103,7 +322,7 @@ static inline void refcount_dec(refcount_t *r)
atomic_dec(&r->refs);
}
# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
-#endif /* CONFIG_REFCOUNT_FULL */
+#endif /* !CONFIG_REFCOUNT_FULL */

extern __must_check bool refcount_dec_if_one(refcount_t *r);
extern __must_check bool refcount_dec_not_one(refcount_t *r);
diff --git a/lib/refcount.c b/lib/refcount.c
index a2f6709..3a534fb 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -1,41 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
/*
- * Variant of atomic_t specialized for reference counts.
- *
- * The interface matches the atomic_t interface (to aid in porting) but only
- * provides the few functions one should use for reference counting.
- *
- * It differs in that the counter saturates at REFCOUNT_SATURATED and will not
- * move once there. This avoids wrapping the counter and causing 'spurious'
- * use-after-free issues.
- *
- * Memory ordering rules are slightly relaxed wrt regular atomic_t functions
- * and provide only what is strictly required for refcounts.
- *
- * The increments are fully relaxed; these will not provide ordering. The
- * rationale is that whatever is used to obtain the object we're increasing the
- * reference count on will provide the ordering. For locked data structures,
- * its the lock acquire, for RCU/lockless data structures its the dependent
- * load.
- *
- * Do note that inc_not_zero() provides a control dependency which will order
- * future stores against the inc, this ensures we'll never modify the object
- * if we did not in fact acquire a reference.
- *
- * The decrements will provide release order, such that all the prior loads and
- * stores will be issued before, it also provides a control dependency, which
- * will order us against the subsequent free().
- *
- * The control dependency is against the load of the cmpxchg (ll/sc) that
- * succeeded. This means the stores aren't fully ordered, but this is fine
- * because the 1->0 transition indicates no concurrency.
- *
- * Note that the allocator is responsible for ordering things between free()
- * and alloc().
- *
- * The decrements dec_and_test() and sub_and_test() also provide acquire
- * ordering on success.
- *
+ * Out-of-line refcount functions common to all refcount implementations.
*/

#include <linux/mutex.h>
@@ -43,207 +8,6 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

-#ifdef CONFIG_REFCOUNT_FULL
-
-/**
- * refcount_add_not_zero - add a value to a refcount unless it is 0
- * @i: the value to add to the refcount
- * @r: the refcount
- *
- * Will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_inc(), or one of its variants, should instead be used to
- * increment a reference count.
- *
- * Return: false if the passed refcount is 0, true otherwise
- */
-bool refcount_add_not_zero(int i, refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (!val)
- return false;
-
- if (unlikely(val == REFCOUNT_SATURATED))
- return true;
-
- new = val + i;
- if (new < val)
- new = REFCOUNT_SATURATED;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
-}
-EXPORT_SYMBOL(refcount_add_not_zero);
-
-/**
- * refcount_add - add a value to a refcount
- * @i: the value to add to the refcount
- * @r: the refcount
- *
- * Similar to atomic_add(), but will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_inc(), or one of its variants, should instead be used to
- * increment a reference count.
- */
-void refcount_add(int i, refcount_t *r)
-{
- WARN_ONCE(!refcount_add_not_zero(i, r), "refcount_t: addition on 0; use-after-free.\n");
-}
-EXPORT_SYMBOL(refcount_add);
-
-/**
- * refcount_inc_not_zero - increment a refcount unless it is 0
- * @r: the refcount to increment
- *
- * Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
- * and WARN.
- *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
- *
- * Return: true if the increment was successful, false otherwise
- */
-bool refcount_inc_not_zero(refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- new = val + 1;
-
- if (!val)
- return false;
-
- if (unlikely(!new))
- return true;
-
- } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
-
- WARN_ONCE(new == REFCOUNT_SATURATED,
- "refcount_t: saturated; leaking memory.\n");
-
- return true;
-}
-EXPORT_SYMBOL(refcount_inc_not_zero);
-
-/**
- * refcount_inc - increment a refcount
- * @r: the refcount to increment
- *
- * Similar to atomic_inc(), but will saturate at REFCOUNT_SATURATED and WARN.
- *
- * Provides no memory ordering, it is assumed the caller already has a
- * reference on the object.
- *
- * Will WARN if the refcount is 0, as this represents a possible use-after-free
- * condition.
- */
-void refcount_inc(refcount_t *r)
-{
- WARN_ONCE(!refcount_inc_not_zero(r), "refcount_t: increment on 0; use-after-free.\n");
-}
-EXPORT_SYMBOL(refcount_inc);
-
-/**
- * refcount_sub_and_test - subtract from a refcount and test if it is 0
- * @i: amount to subtract from the refcount
- * @r: the refcount
- *
- * Similar to atomic_dec_and_test(), but it will WARN, return false and
- * ultimately leak on underflow and will fail to decrement when saturated
- * at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before, and provides an acquire ordering on success such that free()
- * must come after.
- *
- * Use of this function is not recommended for the normal reference counting
- * use case in which references are taken and released one at a time. In these
- * cases, refcount_dec(), or one of its variants, should instead be used to
- * decrement a reference count.
- *
- * Return: true if the resulting refcount is 0, false otherwise
- */
-bool refcount_sub_and_test(int i, refcount_t *r)
-{
- unsigned int new, val = atomic_read(&r->refs);
-
- do {
- if (unlikely(val == REFCOUNT_SATURATED))
- return false;
-
- new = val - i;
- if (new > val) {
- WARN_ONCE(new > val, "refcount_t: underflow; use-after-free.\n");
- return false;
- }
-
- } while (!atomic_try_cmpxchg_release(&r->refs, &val, new));
-
- if (!new) {
- smp_acquire__after_ctrl_dep();
- return true;
- }
- return false;
-
-}
-EXPORT_SYMBOL(refcount_sub_and_test);
-
-/**
- * refcount_dec_and_test - decrement a refcount and test if it is 0
- * @r: the refcount
- *
- * Similar to atomic_dec_and_test(), it will WARN on underflow and fail to
- * decrement when saturated at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before, and provides an acquire ordering on success such that free()
- * must come after.
- *
- * Return: true if the resulting refcount is 0, false otherwise
- */
-bool refcount_dec_and_test(refcount_t *r)
-{
- return refcount_sub_and_test(1, r);
-}
-EXPORT_SYMBOL(refcount_dec_and_test);
-
-/**
- * refcount_dec - decrement a refcount
- * @r: the refcount
- *
- * Similar to atomic_dec(), it will WARN on underflow and fail to decrement
- * when saturated at REFCOUNT_SATURATED.
- *
- * Provides release memory ordering, such that prior loads and stores are done
- * before.
- */
-void refcount_dec(refcount_t *r)
-{
- WARN_ONCE(refcount_dec_and_test(r), "refcount_t: decrement hit 0; leaking memory.\n");
-}
-EXPORT_SYMBOL(refcount_dec);
-
-#endif /* CONFIG_REFCOUNT_FULL */
-
/**
* refcount_dec_if_one - decrement a refcount if it is 1
* @r: the refcount
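
The functions moved into the header all share the same compare-and-swap
retry shape. A user-space sketch of that idiom, with GCC's __atomic
builtins standing in for the kernel's atomic_try_cmpxchg_relaxed() (an
illustration only, not the kernel implementation):

#include <stdbool.h>

static bool inc_not_zero_sketch(int *refs)
{
	int new, val = __atomic_load_n(refs, __ATOMIC_RELAXED);

	do {
		if (!val)
			return false;	/* count hit 0: object is going away */
		new = val + 1;
		/* on failure, 'val' is reloaded with the current value */
	} while (!__atomic_compare_exchange_n(refs, &val, new, false,
					      __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));

	return true;
}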

2019-11-25 08:23:56

by tip-bot2 for Will Deacon

[permalink] [raw]
Subject: [tip: locking/core] locking/refcount: Move saturation warnings out of line

The following commit has been merged into the locking/core branch of tip:

Commit-ID: 1eb085d94256aaa69b00cf5a86e3c5f5bb2bc460
Gitweb: https://git.kernel.org/tip/1eb085d94256aaa69b00cf5a86e3c5f5bb2bc460
Author: Will Deacon <[email protected]>
AuthorDate: Thu, 21 Nov 2019 11:58:58
Committer: Ingo Molnar <[email protected]>
CommitterDate: Mon, 25 Nov 2019 09:15:21 +01:00

locking/refcount: Move saturation warnings out of line

Having the refcount saturation and warnings inline bloats the text,
despite the fact that these paths should never be executed in normal
operation.

Move the refcount saturation and warnings out of line to reduce the
image size when refcount_t checking is enabled. Relative to an x86_64
defconfig, the sizes reported by bloat-o-meter are:

# defconfig+REFCOUNT_FULL, inline saturation (i.e. before this patch)
Total: Before=14762076, After=14915442, chg +1.04%

# defconfig+REFCOUNT_FULL, out-of-line saturation (i.e. after this patch)
Total: Before=14762076, After=14835497, chg +0.50%

A side-effect of this change is that we now only get one warning per
refcount saturation type, rather than one per problematic call-site.

Signed-off-by: Will Deacon <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Hanjun Guo <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Elena Reshetova <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/refcount.h | 39 ++++++++++++++++++++-------------------
lib/refcount.c | 28 ++++++++++++++++++++++++++++-
2 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index e3b218d..1cd0a87 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -23,6 +23,16 @@ typedef struct refcount_struct {

#define REFCOUNT_INIT(n) { .refs = ATOMIC_INIT(n), }

+enum refcount_saturation_type {
+ REFCOUNT_ADD_NOT_ZERO_OVF,
+ REFCOUNT_ADD_OVF,
+ REFCOUNT_ADD_UAF,
+ REFCOUNT_SUB_UAF,
+ REFCOUNT_DEC_LEAK,
+};
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t);
+
/**
* refcount_set - set a refcount's value
* @r: the refcount
@@ -154,10 +164,8 @@ static inline __must_check bool refcount_add_not_zero(int i, refcount_t *r)
break;
} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));

- if (unlikely(old < 0 || old + i < 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: saturated; leaking memory.\n");
- }
+ if (unlikely(old < 0 || old + i < 0))
+ refcount_warn_saturate(r, REFCOUNT_ADD_NOT_ZERO_OVF);

return old;
}
@@ -182,11 +190,10 @@ static inline void refcount_add(int i, refcount_t *r)
{
int old = atomic_fetch_add_relaxed(i, &r->refs);

- WARN_ONCE(!old, "refcount_t: addition on 0; use-after-free.\n");
- if (unlikely(old <= 0 || old + i <= 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(old, "refcount_t: saturated; leaking memory.\n");
- }
+ if (unlikely(!old))
+ refcount_warn_saturate(r, REFCOUNT_ADD_UAF);
+ else if (unlikely(old < 0 || old + i < 0))
+ refcount_warn_saturate(r, REFCOUNT_ADD_OVF);
}

/**
@@ -253,10 +260,8 @@ static inline __must_check bool refcount_sub_and_test(int i, refcount_t *r)
return true;
}

- if (unlikely(old < 0 || old - i < 0)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: underflow; use-after-free.\n");
- }
+ if (unlikely(old < 0 || old - i < 0))
+ refcount_warn_saturate(r, REFCOUNT_SUB_UAF);

return false;
}
@@ -291,12 +296,8 @@ static inline __must_check bool refcount_dec_and_test(refcount_t *r)
*/
static inline void refcount_dec(refcount_t *r)
{
- int old = atomic_fetch_sub_release(1, &r->refs);
-
- if (unlikely(old <= 1)) {
- refcount_set(r, REFCOUNT_SATURATED);
- WARN_ONCE(1, "refcount_t: decrement hit 0; leaking memory.\n");
- }
+ if (unlikely(atomic_fetch_sub_release(1, &r->refs) <= 1))
+ refcount_warn_saturate(r, REFCOUNT_DEC_LEAK);
}
#else /* CONFIG_REFCOUNT_FULL */

diff --git a/lib/refcount.c b/lib/refcount.c
index 3a534fb..8b7e249 100644
--- a/lib/refcount.c
+++ b/lib/refcount.c
@@ -8,6 +8,34 @@
#include <linux/spinlock.h>
#include <linux/bug.h>

+#define REFCOUNT_WARN(str) WARN_ONCE(1, "refcount_t: " str ".\n")
+
+void refcount_warn_saturate(refcount_t *r, enum refcount_saturation_type t)
+{
+ refcount_set(r, REFCOUNT_SATURATED);
+
+ switch (t) {
+ case REFCOUNT_ADD_NOT_ZERO_OVF:
+ REFCOUNT_WARN("saturated; leaking memory");
+ break;
+ case REFCOUNT_ADD_OVF:
+ REFCOUNT_WARN("saturated; leaking memory");
+ break;
+ case REFCOUNT_ADD_UAF:
+ REFCOUNT_WARN("addition on 0; use-after-free");
+ break;
+ case REFCOUNT_SUB_UAF:
+ REFCOUNT_WARN("underflow; use-after-free");
+ break;
+ case REFCOUNT_DEC_LEAK:
+ REFCOUNT_WARN("decrement hit 0; leaking memory");
+ break;
+ default:
+ REFCOUNT_WARN("unknown saturation event!?");
+ }
+}
+EXPORT_SYMBOL(refcount_warn_saturate);
+
/**
* refcount_dec_if_one - decrement a refcount if it is 1
* @r: the refcount
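
One consequence of the shared out-of-line helper, noted in the commit
message above: the WARN_ONCE() now lives inside refcount_warn_saturate()
rather than at each call-site, so each saturation type fires at most once
per boot. A sketch (the objects 'a' and 'b' are hypothetical):

	refcount_dec(&a->ref);	/* count was 1, hits 0: warns
				 * "decrement hit 0; leaking memory" */
	refcount_dec(&b->ref);	/* same REFCOUNT_DEC_LEAK case at a
				 * different call-site: no second splat */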

2019-12-01 15:54:27

by Chen, Rong A

[permalink] [raw]
Subject: [refcount] d2d337b185: WARNING:at_lib/refcount.c:#refcount_warn_saturate

FYI, we noticed the following commit (built with gcc-7):

commit: d2d337b185bd2abff262f3cf7de0080b3888e41c ("[RESEND PATCH v4 08/10] refcount: Consolidate implementations of refcount_t")
url: https://github.com/0day-ci/linux/commits/Will-Deacon/Rework-REFCOUNT_FULL-using-atomic_fetch_-operations/20191124-052413


in testcase: ocfs2test
with following parameters:

disk: 1SSD
test: test-mkfs



on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------------------------------+------------+------------+
| | 2ab80bd4ae | d2d337b185 |
+-----------------------------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 24 | 24 |
| BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c | 24 | 24 |
| stack_segment:#[##] | 6 | 6 |
| RIP:__kmalloc | 8 | 6 |
| Kernel_panic-not_syncing:Fatal_exception | 14 | 11 |
| kernel_BUG_at_mm/slub.c | 9 | 4 |
| invalid_opcode:#[##] | 9 | 4 |
| RIP:kfree | 9 | 4 |
| RIP:native_safe_halt | 4 | 1 |
| Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 6 | 2 |
| BUG:unable_to_handle_page_fault_for_address | 5 | 1 |
| Oops:#[##] | 5 | 1 |
| RIP:kmem_cache_alloc_trace | 3 | 3 |
| RIP:idr_get_free | 1 | |
| RIP:console_unlock | 1 | |
| WARNING:at_lib/refcount.c:#refcount_warn_saturate | 0 | 9 |
| RIP:refcount_warn_saturate | 0 | 9 |
| general_protection_fault:#[##] | 0 | 2 |
| RIP:lru_cache_add_active_or_unevictable | 0 | 1 |
+-----------------------------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 93.421220] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1533
[ 93.423478] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2464, name: mount.ocfs2
[ 93.425258] CPU: 1 PID: 2464 Comm: mount.ocfs2 Not tainted 5.4.0-rc3-00012-gd2d337b185bd2 #1
[ 93.428185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 93.430982] Call Trace:
[ 93.432691] dump_stack+0x5c/0x7b
[ 93.434089] ___might_sleep+0x102/0x120
[ 93.435629] down_write+0x1c/0x50
[ 93.436962] configfs_depend_item+0x3a/0xb0
[ 93.438580] o2hb_region_pin+0xf9/0x180 [ocfs2_nodemanager]
[ 93.440630] ? inode_doinit_with_dentry+0x250/0x4e0
[ 93.441860] o2hb_register_callback+0xc6/0x2a0 [ocfs2_nodemanager]
[ 93.443298] dlm_join_domain+0xbd/0x790 [ocfs2_dlm]
[ 93.444491] ? debugfs_create_dir+0xc4/0x100
[ 93.445635] ? dlm_alloc_ctxt+0x42f/0x560 [ocfs2_dlm]
[ 93.446905] dlm_register_domain+0x31f/0x440 [ocfs2_dlm]
[ 93.448188] ? _cond_resched+0x19/0x30
[ 93.449248] o2cb_cluster_connect+0x132/0x2c0 [ocfs2_stack_o2cb]
[ 93.450701] ocfs2_cluster_connect+0x14b/0x220 [ocfs2_stackglue]
[ 93.452151] ocfs2_dlm_init+0x2e9/0x4b0 [ocfs2]
[ 93.453379] ? ocfs2_init_node_maps+0x50/0x50 [ocfs2]
[ 93.454700] ocfs2_fill_super+0xcf4/0x12a0 [ocfs2]
[ 93.455952] ? ocfs2_initialize_super+0x1030/0x1030 [ocfs2]
[ 93.457484] mount_bdev+0x173/0x1b0
[ 93.458543] legacy_get_tree+0x27/0x40
[ 93.459615] vfs_get_tree+0x25/0xc0
[ 93.460650] do_mount+0x715/0x9a0
[ 93.461697] ksys_mount+0x80/0xd0
[ 93.462728] __x64_sys_mount+0x21/0x30
[ 93.463802] do_syscall_64+0x5b/0x1d0
[ 93.464867] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 93.466224] RIP: 0033:0x7feaa42ee48a
[ 93.467282] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d de f9 2a 00 f7 d8 64 89 01 48
[ 93.472861] RSP: 002b:00007ffec903d4e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 93.475973] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007feaa42ee48a
[ 93.479041] RDX: 0000562b43aae3ee RSI: 0000562b444af0b0 RDI: 0000562b444af310
[ 93.481956] RBP: 00007ffec903d690 R08: 0000562b444af2b0 R09: 0000000000000020
[ 93.485084] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffec903d580
[ 93.488128] R13: 0000000000000000 R14: 0000562b444b0000 R15: 00007ffec903d500
[ 93.494821] o2dlm: Joining domain 58A89FF3B793441A83C86B6A7816B4AF
[ 93.494822] (
[ 93.499229] 1
[ 93.501238] ) 1 nodes
[ 93.509977] JBD2: Ignoring recovery information on journal
[ 93.518913] ocfs2: Mounting device (8,0) on (node 1, slot 0) with ordered data mode.
[ 94.541086] mount /dev/sda /mnt/ocfs2 /dev/sda 16515072 243712 16271360 2% /mnt/ocfs2
[ 94.541090]
[ 94.546474] OK
[ 94.546476]
[ 94.550726] create testdir /mnt/ocfs2/20191130_125727
[ 94.550729]
[ 94.713069] create 15890 files .
[ 94.713072]
[ 94.717547]
[ 98.760820] o2dlm: Leaving domain 58A89FF3B793441A83C86B6A7816B4AF
[ 98.828610] blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 98.830929] floppy: error 10 while reading block 0
[ 99.561301] ocfs2: Unmounting device (8,0) on (node 1)
[ 99.563700] ------------[ cut here ]------------
[ 99.566425] refcount_t: underflow; use-after-free.
[ 99.569213] WARNING: CPU: 1 PID: 2531 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
[ 99.574231] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 ocfs2_nodemanager ocfs2_stackglue jbd2 sr_mod intel_rapl_msr cdrom ata_generic pata_acpi intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sd_mod sg bochs_drm ppdev drm_vram_helper ttm drm_kms_helper snd_pcm syscopyarea ata_piix sysfillrect aesni_intel sysimgblt snd_timer fb_sys_fops crypto_simd libata drm cryptd glue_helper snd soundcore pcspkr joydev serio_raw virtio_scsi i2c_piix4 floppy parport_pc parport ip_tables
[ 99.593419] CPU: 1 PID: 2531 Comm: umount Tainted: G W 5.4.0-rc3-00012-gd2d337b185bd2 #1
[ 99.597311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 99.604305] RIP: 0010:refcount_warn_saturate+0x8d/0xf0
[ 99.607073] Code: 05 ce 6f 37 01 01 e8 32 a0 c1 ff 0f 0b c3 80 3d c1 6f 37 01 00 75 ad 48 c7 c7 e0 9f 73 a5 c6 05 b1 6f 37 01 01 e8 13 a0 c1 ff <0f> 0b c3 80 3d a5 6f 37 01 00 75 8e 48 c7 c7 60 9f 73 a5 c6 05 95
[ 99.612363] RSP: 0018:ffffa895c0457e20 EFLAGS: 00010282
[ 99.613931] RAX: 0000000000000000 RBX: ffff8d07e654e000 RCX: 0000000000000000
[ 99.615800] RDX: ffff8d083fd27640 RSI: ffff8d083fd17778 RDI: ffff8d083fd17778
[ 99.617687] RBP: ffff8d07db479000 R08: 000000000000050a R09: 0000000000aaaaaa
[ 99.619574] R10: ffffa895c0457c48 R11: ffff8d0817a4cf20 R12: ffffa895c0457e34
[ 99.621486] R13: ffff8d07e654e240 R14: ffff8d07e654e0c8 R15: 0000000000000000
[ 99.623382] FS: 00007f39e9509e40(0000) GS:ffff8d083fd00000(0000) knlGS:0000000000000000
[ 99.625452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 99.627106] CR2: 00000000004216d0 CR3: 00000001d2bfe000 CR4: 00000000000406e0
[ 99.628995] Call Trace:
[ 99.630127] ocfs2_dismount_volume+0x32a/0x3e0 [ocfs2]
[ 99.631667] generic_shutdown_super+0x6c/0x120
[ 99.633102] kill_block_super+0x21/0x50
[ 99.634424] deactivate_locked_super+0x3f/0x70
[ 99.635843] cleanup_mnt+0xb8/0x150
[ 99.637130] task_work_run+0xa3/0xe0
[ 99.638407] exit_to_usermode_loop+0xeb/0xf0
[ 99.639827] do_syscall_64+0x1a7/0x1d0
[ 99.641185] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 99.643068] RIP: 0033:0x7f39e8dedd77
[ 99.644352] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f1 00 2b 00 f7 d8 64 89 01 48
[ 99.649122] RSP: 002b:00007fffc63b9bf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 99.651318] RAX: 0000000000000000 RBX: 0000557e0db2e080 RCX: 00007f39e8dedd77
[ 99.653282] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000557e0db2e260
[ 99.655401] RBP: 0000557e0db2e260 R08: 0000557e0db2f600 R09: 0000000000000015
[ 99.657366] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f39e92efe64
[ 99.659581] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fffc63b9e80
[ 99.661503] ---[ end trace eed33931ffa30ed0 ]---


To reproduce:

# build kernel
cd linux
cp config-5.4.0-rc3-00012-gd2d337b185bd2 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (10.19 kB)
config-5.4.0-rc3-00012-gd2d337b185bd2 (203.80 kB)
job-script (5.14 kB)
dmesg.xz (20.02 kB)
Download all attachments

2019-12-02 09:47:31

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [refcount] d2d337b185: WARNING:at_lib/refcount.c:#refcount_warn_saturate

On Sun, 1 Dec 2019 at 16:49, kernel test robot <[email protected]> wrote:
>
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: d2d337b185bd2abff262f3cf7de0080b3888e41c ("[RESEND PATCH v4 08/10] refcount: Consolidate implementations of refcount_t")
> url: https://github.com/0day-ci/linux/commits/Will-Deacon/Rework-REFCOUNT_FULL-using-atomic_fetch_-operations/20191124-052413
>
>
> in testcase: ocfs2test
> with following parameters:
>
> disk: 1SSD
> test: test-mkfs
>
>
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
>
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>
>
> +-----------------------------------------------------------------------------+------------+------------+
> | | 2ab80bd4ae | d2d337b185 |
> +-----------------------------------------------------------------------------+------------+------------+
> | boot_successes | 0 | 0 |
> | boot_failures | 24 | 24 |

So we went from a success rate of 0 out of 24 to 0 out of 24 by
applying that patch. How on earth is that a result that justifies
spamming everybody?


> | BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c | 24 | 24 |
> | stack_segment:#[##] | 6 | 6 |
> | RIP:__kmalloc | 8 | 6 |
> | Kernel_panic-not_syncing:Fatal_exception | 14 | 11 |
> | kernel_BUG_at_mm/slub.c | 9 | 4 |
> | invalid_opcode:#[##] | 9 | 4 |
> | RIP:kfree | 9 | 4 |
> | RIP:native_safe_halt | 4 | 1 |
> | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 6 | 2 |
> | BUG:unable_to_handle_page_fault_for_address | 5 | 1 |
> | Oops:#[##] | 5 | 1 |
> | RIP:kmem_cache_alloc_trace | 3 | 3 |
> | RIP:idr_get_free | 1 | |
> | RIP:console_unlock | 1 | |
> | WARNING:at_lib/refcount.c:#refcount_warn_saturate | 0 | 9 |
> | RIP:refcount_warn_saturate | 0 | 9 |
> | general_protection_fault:#[##] | 0 | 2 |
> | RIP:lru_cache_add_active_or_unevictable | 0 | 1 |
> +-----------------------------------------------------------------------------+------------+------------+
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
>
> [ 93.421220] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1533
> [ 93.423478] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2464, name: mount.ocfs2
> [ 93.425258] CPU: 1 PID: 2464 Comm: mount.ocfs2 Not tainted 5.4.0-rc3-00012-gd2d337b185bd2 #1
> [ 93.428185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 93.430982] Call Trace:
> [ 93.432691] dump_stack+0x5c/0x7b
> [ 93.434089] ___might_sleep+0x102/0x120
> [ 93.435629] down_write+0x1c/0x50
> [ 93.436962] configfs_depend_item+0x3a/0xb0
> [ 93.438580] o2hb_region_pin+0xf9/0x180 [ocfs2_nodemanager]
> [ 93.440630] ? inode_doinit_with_dentry+0x250/0x4e0
> [ 93.441860] o2hb_register_callback+0xc6/0x2a0 [ocfs2_nodemanager]
> [ 93.443298] dlm_join_domain+0xbd/0x790 [ocfs2_dlm]
> [ 93.444491] ? debugfs_create_dir+0xc4/0x100
> [ 93.445635] ? dlm_alloc_ctxt+0x42f/0x560 [ocfs2_dlm]
> [ 93.446905] dlm_register_domain+0x31f/0x440 [ocfs2_dlm]
> [ 93.448188] ? _cond_resched+0x19/0x30
> [ 93.449248] o2cb_cluster_connect+0x132/0x2c0 [ocfs2_stack_o2cb]
> [ 93.450701] ocfs2_cluster_connect+0x14b/0x220 [ocfs2_stackglue]
> [ 93.452151] ocfs2_dlm_init+0x2e9/0x4b0 [ocfs2]
> [ 93.453379] ? ocfs2_init_node_maps+0x50/0x50 [ocfs2]
> [ 93.454700] ocfs2_fill_super+0xcf4/0x12a0 [ocfs2]
> [ 93.455952] ? ocfs2_initialize_super+0x1030/0x1030 [ocfs2]
> [ 93.457484] mount_bdev+0x173/0x1b0
> [ 93.458543] legacy_get_tree+0x27/0x40
> [ 93.459615] vfs_get_tree+0x25/0xc0
> [ 93.460650] do_mount+0x715/0x9a0
> [ 93.461697] ksys_mount+0x80/0xd0
> [ 93.462728] __x64_sys_mount+0x21/0x30
> [ 93.463802] do_syscall_64+0x5b/0x1d0
> [ 93.464867] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 93.466224] RIP: 0033:0x7feaa42ee48a
> [ 93.467282] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d de f9 2a 00 f7 d8 64 89 01 48
> [ 93.472861] RSP: 002b:00007ffec903d4e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
> [ 93.475973] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007feaa42ee48a
> [ 93.479041] RDX: 0000562b43aae3ee RSI: 0000562b444af0b0 RDI: 0000562b444af310
> [ 93.481956] RBP: 00007ffec903d690 R08: 0000562b444af2b0 R09: 0000000000000020
> [ 93.485084] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffec903d580
> [ 93.488128] R13: 0000000000000000 R14: 0000562b444b0000 R15: 00007ffec903d500
> [ 93.494821] o2dlm: Joining domain 58A89FF3B793441A83C86B6A7816B4AF
> [ 93.494822] (
> [ 93.499229] 1
> [ 93.501238] ) 1 nodes
> [ 93.509977] JBD2: Ignoring recovery information on journal
> [ 93.518913] ocfs2: Mounting device (8,0) on (node 1, slot 0) with ordered data mode.
> [ 94.541086] mount /dev/sda /mnt/ocfs2 /dev/sda 16515072 243712 16271360 2% /mnt/ocfs2
> [ 94.541090]
> [ 94.546474] OK
> [ 94.546476]
> [ 94.550726] create testdir /mnt/ocfs2/20191130_125727
> [ 94.550729]
> [ 94.713069] create 15890 files .
> [ 94.713072]
> [ 94.717547]
> [ 98.760820] o2dlm: Leaving domain 58A89FF3B793441A83C86B6A7816B4AF
> [ 98.828610] blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 98.830929] floppy: error 10 while reading block 0
> [ 99.561301] ocfs2: Unmounting device (8,0) on (node 1)
> [ 99.563700] ------------[ cut here ]------------
> [ 99.566425] refcount_t: underflow; use-after-free.
> [ 99.569213] WARNING: CPU: 1 PID: 2531 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
> [ 99.574231] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 ocfs2_nodemanager ocfs2_stackglue jbd2 sr_mod intel_rapl_msr cdrom ata_generic pata_acpi intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sd_mod sg bochs_drm ppdev drm_vram_helper ttm drm_kms_helper snd_pcm syscopyarea ata_piix sysfillrect aesni_intel sysimgblt snd_timer fb_sys_fops crypto_simd libata drm cryptd glue_helper snd soundcore pcspkr joydev serio_raw virtio_scsi i2c_piix4 floppy parport_pc parport ip_tables
> [ 99.593419] CPU: 1 PID: 2531 Comm: umount Tainted: G W 5.4.0-rc3-00012-gd2d337b185bd2 #1
> [ 99.597311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 99.604305] RIP: 0010:refcount_warn_saturate+0x8d/0xf0
> [ 99.607073] Code: 05 ce 6f 37 01 01 e8 32 a0 c1 ff 0f 0b c3 80 3d c1 6f 37 01 00 75 ad 48 c7 c7 e0 9f 73 a5 c6 05 b1 6f 37 01 01 e8 13 a0 c1 ff <0f> 0b c3 80 3d a5 6f 37 01 00 75 8e 48 c7 c7 60 9f 73 a5 c6 05 95
> [ 99.612363] RSP: 0018:ffffa895c0457e20 EFLAGS: 00010282
> [ 99.613931] RAX: 0000000000000000 RBX: ffff8d07e654e000 RCX: 0000000000000000
> [ 99.615800] RDX: ffff8d083fd27640 RSI: ffff8d083fd17778 RDI: ffff8d083fd17778
> [ 99.617687] RBP: ffff8d07db479000 R08: 000000000000050a R09: 0000000000aaaaaa
> [ 99.619574] R10: ffffa895c0457c48 R11: ffff8d0817a4cf20 R12: ffffa895c0457e34
> [ 99.621486] R13: ffff8d07e654e240 R14: ffff8d07e654e0c8 R15: 0000000000000000
> [ 99.623382] FS: 00007f39e9509e40(0000) GS:ffff8d083fd00000(0000) knlGS:0000000000000000
> [ 99.625452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 99.627106] CR2: 00000000004216d0 CR3: 00000001d2bfe000 CR4: 00000000000406e0
> [ 99.628995] Call Trace:
> [ 99.630127] ocfs2_dismount_volume+0x32a/0x3e0 [ocfs2]
> [ 99.631667] generic_shutdown_super+0x6c/0x120
> [ 99.633102] kill_block_super+0x21/0x50
> [ 99.634424] deactivate_locked_super+0x3f/0x70
> [ 99.635843] cleanup_mnt+0xb8/0x150
> [ 99.637130] task_work_run+0xa3/0xe0
> [ 99.638407] exit_to_usermode_loop+0xeb/0xf0
> [ 99.639827] do_syscall_64+0x1a7/0x1d0
> [ 99.641185] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 99.643068] RIP: 0033:0x7f39e8dedd77
> [ 99.644352] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f1 00 2b 00 f7 d8 64 89 01 48
> [ 99.649122] RSP: 002b:00007fffc63b9bf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> [ 99.651318] RAX: 0000000000000000 RBX: 0000557e0db2e080 RCX: 00007f39e8dedd77
> [ 99.653282] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000557e0db2e260
> [ 99.655401] RBP: 0000557e0db2e260 R08: 0000557e0db2f600 R09: 0000000000000015
> [ 99.657366] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f39e92efe64
> [ 99.659581] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fffc63b9e80
> [ 99.661503] ---[ end trace eed33931ffa30ed0 ]---
>
>
> To reproduce:
>
> # build kernel
> cd linux
> cp config-5.4.0-rc3-00012-gd2d337b185bd2 .config
> make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
> make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> cd <mod-install-dir>
> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>
>
>
> Thanks,
> Rong Chen
>

2019-12-03 08:03:16

by Chen, Rong A

[permalink] [raw]
Subject: Re: [LKP] Re: [refcount] d2d337b185: WARNING:at_lib/refcount.c:#refcount_warn_saturate



On 12/2/19 5:43 PM, Ard Biesheuvel wrote:
> On Sun, 1 Dec 2019 at 16:49, kernel test robot <[email protected]> wrote:
>> FYI, we noticed the following commit (built with gcc-7):
>>
>> commit: d2d337b185bd2abff262f3cf7de0080b3888e41c ("[RESEND PATCH v4 08/10] refcount: Consolidate implementations of refcount_t")
>> url: https://github.com/0day-ci/linux/commits/Will-Deacon/Rework-REFCOUNT_FULL-using-atomic_fetch_-operations/20191124-052413
>>
>>
>> in testcase: ocfs2test
>> with following parameters:
>>
>> disk: 1SSD
>> test: test-mkfs
>>
>>
>>
>> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>
>>
>> +-----------------------------------------------------------------------------+------------+------------+
>> | | 2ab80bd4ae | d2d337b185 |
>> +-----------------------------------------------------------------------------+------------+------------+
>> | boot_successes | 0 | 0 |
>> | boot_failures | 24 | 24 |
> So we went from a success rate of 0 out of 24 to 0 out of 24 by
> applying that patch. How on earth is that a result that justifies
> spamming everybody?

Hi Ard,

These failures were triggered by the ocfs2test test, and all tests failed,
including on the parent commit "2ab80bd4ae".

>
>> | BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c | 24 | 24 |
>> | stack_segment:#[##] | 6 | 6 |
>> | RIP:__kmalloc | 8 | 6 |
>> | Kernel_panic-not_syncing:Fatal_exception | 14 | 11 |
>> | kernel_BUG_at_mm/slub.c | 9 | 4 |
>> | invalid_opcode:#[##] | 9 | 4 |
>> | RIP:kfree | 9 | 4 |
>> | RIP:native_safe_halt | 4 | 1 |
>> | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 6 | 2 |
>> | BUG:unable_to_handle_page_fault_for_address | 5 | 1 |
>> | Oops:#[##] | 5 | 1 |
>> | RIP:kmem_cache_alloc_trace | 3 | 3 |
>> | RIP:idr_get_free | 1 | |
>> | RIP:console_unlock | 1 | |
>> | WARNING:at_lib/refcount.c:#refcount_warn_saturate | 0 | 9 |
>> | RIP:refcount_warn_saturate | 0 | 9 |

The difference is that commit "d2d337b185" introduces new errors, e.g.
RIP:refcount_warn_saturate.
In addition, you can get the dmesg file in the previous report email.

Please feel free to contact us if you need more information, and please
forgive us if we got something wrong.

Best Regards,
Rong Chen

>> | general_protection_fault:#[##] | 0 | 2 |
>> | RIP:lru_cache_add_active_or_unevictable | 0 | 1 |
>> +-----------------------------------------------------------------------------+------------+------------+
>>
>>
>> If you fix the issue, kindly add following tag
>> Reported-by: kernel test robot <[email protected]>
>>
>>
>> [ 93.421220] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1533
>> [ 93.423478] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2464, name: mount.ocfs2
>> [ 93.425258] CPU: 1 PID: 2464 Comm: mount.ocfs2 Not tainted 5.4.0-rc3-00012-gd2d337b185bd2 #1
>> [ 93.428185] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 93.430982] Call Trace:
>> [ 93.432691] dump_stack+0x5c/0x7b
>> [ 93.434089] ___might_sleep+0x102/0x120
>> [ 93.435629] down_write+0x1c/0x50
>> [ 93.436962] configfs_depend_item+0x3a/0xb0
>> [ 93.438580] o2hb_region_pin+0xf9/0x180 [ocfs2_nodemanager]
>> [ 93.440630] ? inode_doinit_with_dentry+0x250/0x4e0
>> [ 93.441860] o2hb_register_callback+0xc6/0x2a0 [ocfs2_nodemanager]
>> [ 93.443298] dlm_join_domain+0xbd/0x790 [ocfs2_dlm]
>> [ 93.444491] ? debugfs_create_dir+0xc4/0x100
>> [ 93.445635] ? dlm_alloc_ctxt+0x42f/0x560 [ocfs2_dlm]
>> [ 93.446905] dlm_register_domain+0x31f/0x440 [ocfs2_dlm]
>> [ 93.448188] ? _cond_resched+0x19/0x30
>> [ 93.449248] o2cb_cluster_connect+0x132/0x2c0 [ocfs2_stack_o2cb]
>> [ 93.450701] ocfs2_cluster_connect+0x14b/0x220 [ocfs2_stackglue]
>> [ 93.452151] ocfs2_dlm_init+0x2e9/0x4b0 [ocfs2]
>> [ 93.453379] ? ocfs2_init_node_maps+0x50/0x50 [ocfs2]
>> [ 93.454700] ocfs2_fill_super+0xcf4/0x12a0 [ocfs2]
>> [ 93.455952] ? ocfs2_initialize_super+0x1030/0x1030 [ocfs2]
>> [ 93.457484] mount_bdev+0x173/0x1b0
>> [ 93.458543] legacy_get_tree+0x27/0x40
>> [ 93.459615] vfs_get_tree+0x25/0xc0
>> [ 93.460650] do_mount+0x715/0x9a0
>> [ 93.461697] ksys_mount+0x80/0xd0
>> [ 93.462728] __x64_sys_mount+0x21/0x30
>> [ 93.463802] do_syscall_64+0x5b/0x1d0
>> [ 93.464867] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [ 93.466224] RIP: 0033:0x7feaa42ee48a
>> [ 93.467282] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d de f9 2a 00 f7 d8 64 89 01 48
>> [ 93.472861] RSP: 002b:00007ffec903d4e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
>> [ 93.475973] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007feaa42ee48a
>> [ 93.479041] RDX: 0000562b43aae3ee RSI: 0000562b444af0b0 RDI: 0000562b444af310
>> [ 93.481956] RBP: 00007ffec903d690 R08: 0000562b444af2b0 R09: 0000000000000020
>> [ 93.485084] R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffec903d580
>> [ 93.488128] R13: 0000000000000000 R14: 0000562b444b0000 R15: 00007ffec903d500
>> [ 93.494821] o2dlm: Joining domain 58A89FF3B793441A83C86B6A7816B4AF
>> [ 93.494822] (
>> [ 93.499229] 1
>> [ 93.501238] ) 1 nodes
>> [ 93.509977] JBD2: Ignoring recovery information on journal
>> [ 93.518913] ocfs2: Mounting device (8,0) on (node 1, slot 0) with ordered data mode.
>> [ 94.541086] mount /dev/sda /mnt/ocfs2 /dev/sda 16515072 243712 16271360 2% /mnt/ocfs2
>> [ 94.541090]
>> [ 94.546474] OK
>> [ 94.546476]
>> [ 94.550726] create testdir /mnt/ocfs2/20191130_125727
>> [ 94.550729]
>> [ 94.713069] create 15890 files .
>> [ 94.713072]
>> [ 94.717547]
>> [ 98.760820] o2dlm: Leaving domain 58A89FF3B793441A83C86B6A7816B4AF
>> [ 98.828610] blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
>> [ 98.830929] floppy: error 10 while reading block 0
>> [ 99.561301] ocfs2: Unmounting device (8,0) on (node 1)
>> [ 99.563700] ------------[ cut here ]------------
>> [ 99.566425] refcount_t: underflow; use-after-free.
>> [ 99.569213] WARNING: CPU: 1 PID: 2531 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
>> [ 99.574231] Modules linked in: ocfs2_stack_o2cb ocfs2_dlm ocfs2 ocfs2_nodemanager ocfs2_stackglue jbd2 sr_mod intel_rapl_msr cdrom ata_generic pata_acpi intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sd_mod sg bochs_drm ppdev drm_vram_helper ttm drm_kms_helper snd_pcm syscopyarea ata_piix sysfillrect aesni_intel sysimgblt snd_timer fb_sys_fops crypto_simd libata drm cryptd glue_helper snd soundcore pcspkr joydev serio_raw virtio_scsi i2c_piix4 floppy parport_pc parport ip_tables
>> [ 99.593419] CPU: 1 PID: 2531 Comm: umount Tainted: G W 5.4.0-rc3-00012-gd2d337b185bd2 #1
>> [ 99.597311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> [ 99.604305] RIP: 0010:refcount_warn_saturate+0x8d/0xf0
>> [ 99.607073] Code: 05 ce 6f 37 01 01 e8 32 a0 c1 ff 0f 0b c3 80 3d c1 6f 37 01 00 75 ad 48 c7 c7 e0 9f 73 a5 c6 05 b1 6f 37 01 01 e8 13 a0 c1 ff <0f> 0b c3 80 3d a5 6f 37 01 00 75 8e 48 c7 c7 60 9f 73 a5 c6 05 95
>> [ 99.612363] RSP: 0018:ffffa895c0457e20 EFLAGS: 00010282
>> [ 99.613931] RAX: 0000000000000000 RBX: ffff8d07e654e000 RCX: 0000000000000000
>> [ 99.615800] RDX: ffff8d083fd27640 RSI: ffff8d083fd17778 RDI: ffff8d083fd17778
>> [ 99.617687] RBP: ffff8d07db479000 R08: 000000000000050a R09: 0000000000aaaaaa
>> [ 99.619574] R10: ffffa895c0457c48 R11: ffff8d0817a4cf20 R12: ffffa895c0457e34
>> [ 99.621486] R13: ffff8d07e654e240 R14: ffff8d07e654e0c8 R15: 0000000000000000
>> [ 99.623382] FS: 00007f39e9509e40(0000) GS:ffff8d083fd00000(0000) knlGS:0000000000000000
>> [ 99.625452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 99.627106] CR2: 00000000004216d0 CR3: 00000001d2bfe000 CR4: 00000000000406e0
>> [ 99.628995] Call Trace:
>> [ 99.630127] ocfs2_dismount_volume+0x32a/0x3e0 [ocfs2]
>> [ 99.631667] generic_shutdown_super+0x6c/0x120
>> [ 99.633102] kill_block_super+0x21/0x50
>> [ 99.634424] deactivate_locked_super+0x3f/0x70
>> [ 99.635843] cleanup_mnt+0xb8/0x150
>> [ 99.637130] task_work_run+0xa3/0xe0
>> [ 99.638407] exit_to_usermode_loop+0xeb/0xf0
>> [ 99.639827] do_syscall_64+0x1a7/0x1d0
>> [ 99.641185] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> [ 99.643068] RIP: 0033:0x7f39e8dedd77
>> [ 99.644352] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f1 00 2b 00 f7 d8 64 89 01 48
>> [ 99.649122] RSP: 002b:00007fffc63b9bf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>> [ 99.651318] RAX: 0000000000000000 RBX: 0000557e0db2e080 RCX: 00007f39e8dedd77
>> [ 99.653282] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000557e0db2e260
>> [ 99.655401] RBP: 0000557e0db2e260 R08: 0000557e0db2f600 R09: 0000000000000015
>> [ 99.657366] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f39e92efe64
>> [ 99.659581] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fffc63b9e80
>> [ 99.661503] ---[ end trace eed33931ffa30ed0 ]---
>>
>>
>> To reproduce:
>>
>> # build kernel
>> cd linux
>> cp config-5.4.0-rc3-00012-gd2d337b185bd2 .config
>> make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
>> make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
>> cd <mod-install-dir>
>> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>>
>>
>> git clone https://github.com/intel/lkp-tests.git
>> cd lkp-tests
>> bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>>
>>
>>
>> Thanks,
>> Rong Chen
>>
> _______________________________________________
> LKP mailing list -- [email protected]
> To unsubscribe send an email to [email protected]

2019-12-03 08:10:28

by Will Deacon

[permalink] [raw]
Subject: Re: [LKP] Re: [refcount] d2d337b185: WARNING:at_lib/refcount.c:#refcount_warn_saturate

On Tue, Dec 03, 2019 at 04:01:46PM +0800, Rong Chen wrote:
> On 12/2/19 5:43 PM, Ard Biesheuvel wrote:
> > On Sun, 1 Dec 2019 at 16:49, kernel test robot <[email protected]> wrote:
> > > FYI, we noticed the following commit (built with gcc-7):
> > >
> > > commit: d2d337b185bd2abff262f3cf7de0080b3888e41c ("[RESEND PATCH v4
> > > 08/10] refcount: Consolidate implementations of refcount_t")
> > > url:
> > > https://github.com/0day-ci/linux/commits/Will-Deacon/Rework-REFCOUNT_FULL-using-atomic_fetch_-operations/20191124-052413
> > >
> > >
> > > in testcase: ocfs2test
> > > with following parameters:
> > >
> > > disk: 1SSD
> > > test: test-mkfs
> > >
> > So we went from a success rate of 0 out of 24 to 0 out of 24 by
> > applying that patch. How on earth is that a result that justifies
> > spamming everybody?
>
> These failures were triggered by the ocfs2test test, and all tests failed,
> including on the parent commit "2ab80bd4ae".

https://lore.kernel.org/lkml/20191119182745.GA11397@willie-the-truck
https://lore.kernel.org/lkml/20190912105640.2l6mtdjmcyyhmyun@willie-the-truck/

The refcount code is doing its job afaict and it's the ocfs2 code at fault.

Will
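
For reference, the minimal sequence that produces this particular splat is
an extra 'put' on an already-released object; a sketch (not the actual
ocfs2 code path):

	refcount_t ref = REFCOUNT_INIT(1);
	bool last;

	last = refcount_dec_and_test(&ref);	/* true: 1 -> 0, final put */
	last = refcount_dec_and_test(&ref);	/* false, and the 0 - 1
						 * underflow makes
						 * refcount_warn_saturate()
						 * print "underflow;
						 * use-after-free" and pin
						 * the count at
						 * REFCOUNT_SATURATED */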

2019-12-03 08:44:16

by Chen, Rong A

[permalink] [raw]
Subject: Re: [LKP] Re: [refcount] d2d337b185: WARNING:at_lib/refcount.c:#refcount_warn_saturate



On 12/3/19 4:09 PM, Will Deacon wrote:
> On Tue, Dec 03, 2019 at 04:01:46PM +0800, Rong Chen wrote:
>> On 12/2/19 5:43 PM, Ard Biesheuvel wrote:
>>> On Sun, 1 Dec 2019 at 16:49, kernel test robot <[email protected]> wrote:
>>>> FYI, we noticed the following commit (built with gcc-7):
>>>>
>>>> commit: d2d337b185bd2abff262f3cf7de0080b3888e41c ("[RESEND PATCH v4
>>>> 08/10] refcount: Consolidate implementations of refcount_t")
>>>> url:
>>>> https://github.com/0day-ci/linux/commits/Will-Deacon/Rework-REFCOUNT_FULL-using-atomic_fetch_-operations/20191124-052413
>>>>
>>>>
>>>> in testcase: ocfs2test
>>>> with following parameters:
>>>>
>>>> disk: 1SSD
>>>> test: test-mkfs
>>>>
>>> So we went from a success rate of 0 out of 24 to 0 out of 24 by
>>> applying that patch. How on earth is that a result that justifies
>>> spamming everybody?
>> These failures were triggered by the ocfs2test test, and all of the tests
>> failed, including with the parent commit "2ab80bd4ae".
> https://lore.kernel.org/lkml/20191119182745.GA11397@willie-the-truck
> https://lore.kernel.org/lkml/20190912105640.2l6mtdjmcyyhmyun@willie-the-truck/
>
> The refcount code is doing its job afaict and it's the ocfs2 code at fault.
>
> Will

Hi Will,

Thanks for your explanation, we'll disable the test.

Best Regards,
Rong Chen

2020-02-26 04:12:38

by Jann Horn

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 05/10] lib/refcount: Improve performance of generic REFCOUNT_FULL code

On Thu, Nov 21, 2019 at 12:58 PM Will Deacon <[email protected]> wrote:
> Rewrite the generic REFCOUNT_FULL implementation so that the saturation
> point is moved to INT_MIN / 2. This allows us to defer the sanity checks
> until after the atomic operation, which removes many uses of cmpxchg()
> in favour of atomic_fetch_{add,sub}().

Oh, I never saw this, this is really neat! CCing the kernel-hardening
list on this might've been a good idea.

> + * Saturation semantics
> + * ====================
> + *
> + * refcount_t differs from atomic_t in that the counter saturates at
> + * REFCOUNT_SATURATED and will not move once there. This avoids wrapping the
> + * counter and causing 'spurious' use-after-free issues. In order to avoid the
> + * cost associated with introducing cmpxchg() loops into all of the saturating
> + * operations, we temporarily allow the counter to take on an unchecked value
> + * and then explicitly set it to REFCOUNT_SATURATED on detecting that underflow
> + * or overflow has occurred. Although this is racy when multiple threads
> + * access the refcount concurrently, by placing REFCOUNT_SATURATED roughly
> + * equidistant from 0 and INT_MAX we minimise the scope for error:
> + *
> + *                                  INT_MAX     REFCOUNT_SATURATED   UINT_MAX
> + *   0                          (0x7fff_ffff)    (0xc000_0000)    (0xffff_ffff)
> + *   +--------------------------------+----------------+----------------+
> + *                                    <---------- bad value! ---------->
> + *
> + * (in a signed view of the world, the "bad value" range corresponds to
> + * a negative counter value).
[...]
> + * If another thread also performs a refcount_inc() operation between the two
> + * atomic operations, then the count will continue to edge closer to 0. If it
> + * reaches a value of 1 before /any/ of the threads reset it to the saturated
> + * value, then a concurrent refcount_dec_and_test() may erroneously free the
> + * underlying object. Given the precise timing details involved with the
> + * round-robin scheduling of each thread manipulating the refcount and the need
> + * to hit the race multiple times in succession, there doesn't appear to be a
> + * practical avenue of attack even if using refcount_add() operations with
> + * larger increments.

On top of that, the number of threads that can actually be running at
a given time is capped. See include/linux/threads.h, where it is
capped to pow(2, 22):

/*
* A maximum of 4 million PIDs should be enough for a while.
* [NOTE: PID/TIDs are limited to 2^29 ~= 500+ million, see futex.h.]
*/
#define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
	(sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))

And in the futex UAPI header, we have this, baking a TID limit into
the userspace API (note that this is pow(2,30), not pow(2,29) as the
comment in threads.h claims - I'm not sure where that difference comes
from):

/*
* The rest of the robust-futex field is for the TID:
*/
#define FUTEX_TID_MASK 0x3fffffff

So AFAICS, with the current PID_MAX_LIMIT, if you assume that all
participating refcount operations are non-batched (delta 1) and the
attacker can't cause the threads to oops in the middle of the refcount
operation (maybe that would be possible if you managed to find
something like a NULL pointer dereference in perf software event code
and had perf paranoia at <= 1, or something like that? - I'm not
sure), then even in a theoretical scenario where an attacker spawns
the maximum number of tasks possible and manages to get all of them to
sequentially preempt while being in the middle of increment operations
in several nested contexts (I'm not sure whether that can even happen
- you're not going to take typical sleeping exceptions like page
faults in the middle of a refcount op), the attacker will stay
comfortably inside the saturated range. Even if the PID_MAX_LIMIT is
at some point raised to the maximum permitted by the futex UAPI, this
still holds as long as you assume no nesting. Hm, should I send a
patch to add something like this to the explanatory comment?
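
(To put rough numbers on the headroom -- a back-of-the-envelope sketch,
assuming the INT_MIN / 2 saturation point from the patch and the current
PID_MAX_LIMIT quoted above:)

#include <stdio.h>

int main(void)
{
	/* Distance from REFCOUNT_SATURATED (0xc000_0000) to either edge
	 * of the "bad" range, i.e. down to 0x8000_0000 or up (wrapping)
	 * to 0. */
	unsigned long margin = 1UL << 30;

	/* Worst case: every possible thread parked mid-increment. */
	unsigned long max_threads = 4UL * 1024 * 1024;	/* 2^22 */

	printf("headroom per in-flight increment: %lu\n",
	       margin / max_threads);	/* prints 256 */
	return 0;
}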


Of course, if someone uses refcount batching with sufficiently large
values, those guarantees go out of the window - if we wanted to be
perfectionist about this, we could make the batched operations do slow
cmpxchg stuff while letting the much more performance-critical
single-reference case continue to use the fast saturation scheme.
OTOH, the networking folks would probably hate that, since they're
using the batched ops for ->sk_wmem_alloc stuff, where they count
bytes as references? So I guess maybe we should leave it as-is.
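
(For readers following along: a minimal userspace C sketch of the
post-check scheme described in the quoted comment -- a simplification for
illustration, not the kernel's actual implementation, and the
refcount_add_sketch() name is made up here:)

#include <limits.h>
#include <stdatomic.h>
#include <stdio.h>

#define REFCOUNT_SATURATED	(INT_MIN / 2)	/* 0xc000_0000 */

/* Add first, check afterwards: no cmpxchg() loop on the fast path. */
static void refcount_add_sketch(int i, atomic_int *r)
{
	int old = atomic_fetch_add_explicit(r, i, memory_order_relaxed);
	/* Recompute the post-add value with defined wraparound; the
	 * kernel relies on -fno-strict-overflow instead. */
	int new = (int)((unsigned int)old + (unsigned int)i);

	/* A negative old or new value means the counter is in, or has
	 * strayed into, the "bad" half of the range: pin it. */
	if (old < 0 || new < 0)
		atomic_store_explicit(r, REFCOUNT_SATURATED,
				      memory_order_relaxed);
}

int main(void)
{
	atomic_int r = INT_MAX;		/* one step away from overflow */

	refcount_add_sketch(1, &r);	/* overflows; the post-check pins it */
	printf("counter = 0x%x\n", (unsigned int)atomic_load(&r));
	return 0;
}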

2020-02-28 10:44:26

by Will Deacon

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 05/10] lib/refcount: Improve performance of generic REFCOUNT_FULL code

Hi Jann,

On Wed, Feb 26, 2020 at 05:10:46AM +0100, Jann Horn wrote:
> On Thu, Nov 21, 2019 at 12:58 PM Will Deacon <[email protected]> wrote:
> > Rewrite the generic REFCOUNT_FULL implementation so that the saturation
> > point is moved to INT_MIN / 2. This allows us to defer the sanity checks
> > until after the atomic operation, which removes many uses of cmpxchg()
> > in favour of atomic_fetch_{add,sub}().
>
> Oh, I never saw this, this is really neat! CCing the kernel-hardening
> list on this might've been a good idea.

Glad you like it! I'll try to remember to cc them in future but
get_maintainer.pl doesn't tend to mention them very often.

> > + * If another thread also performs a refcount_inc() operation between the two
> > + * atomic operations, then the count will continue to edge closer to 0. If it
> > + * reaches a value of 1 before /any/ of the threads reset it to the saturated
> > + * value, then a concurrent refcount_dec_and_test() may erroneously free the
> > + * underlying object. Given the precise timing details involved with the
> > + * round-robin scheduling of each thread manipulating the refcount and the need
> > + * to hit the race multiple times in succession, there doesn't appear to be a
> > + * practical avenue of attack even if using refcount_add() operations with
> > + * larger increments.
>
> On top of that, the number of threads that can actually be running at
> a given time is capped. See include/linux/threads.h, where it is
> capped to pow(2, 22):
>
> /*
> * A maximum of 4 million PIDs should be enough for a while.
> * [NOTE: PID/TIDs are limited to 2^29 ~= 500+ million, see futex.h.]
> */
> #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
> 	(sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
>
> And in the futex UAPI header, we have this, baking a TID limit into
> the userspace API (note that this is pow(2,30), not pow(2,29) as the
> comment in threads.h claims - I'm not sure where that difference comes
> from):
>
> /*
> * The rest of the robust-futex field is for the TID:
> */
> #define FUTEX_TID_MASK 0x3fffffff
>
> So AFAICS, with the current PID_MAX_LIMIT, if you assume that all
> participating refcount operations are non-batched (delta 1) and the
> attacker can't cause the threads to oops in the middle of the refcount
> operation (maybe that would be possible if you managed to find
> something like a NULL pointer dereference in perf software event code
> and had perf paranoia at <= 1, or something like that? - I'm not
> sure), then even in a theoretical scenario where an attacker spawns
> the maximum number of tasks possible and manages to get all of them to
> sequentially preempt while being in the middle of increment operations
> in several nested contexts (I'm not sure whether that can even happen
> - you're not going to take typical sleeping exceptions like page
> faults in the middle of a refcount op), the attacker will stay
> comfortably inside the saturated range. Even if the PID_MAX_LIMIT is
> at some point raised to the maximum permitted by the futex UAPI, this
> still holds as long as you assume no nesting. Hm, should I send a
> patch to add something like this to the explanatory comment?

Sure, I'd be happy to improve the document by adding this -- please send
out a patch for review. It's probably also worth mentioning the batching
use-cases, although I struggle to reason about the window between the
{under,over}flow occurring and saturation. The idea of oopsing during that
period is interesting, although at least that's not silent (and I think
Android usually (always?) sets panic_on_oops to 1).

> Of course, if someone uses refcount batching with sufficiently large
> values, those guarantees go out of the window - if we wanted to be
> perfectionist about this, we could make the batched operations do slow
> cmpxchg stuff while letting the much more performance-critical
> single-reference case continue to use the fast saturation scheme.
> OTOH, the networking folks would probably hate that, since they're
> using the batched ops for ->sk_wmem_alloc stuff, where they count
> bytes as references? So I guess maybe we should leave it as-is.

Agreed. If we hamper the performance here, then people will either look
to disable the checking or they will switch to atomic_t, which puts us
back to square one. Perfection is the enemy of the good and all that.

Having said that, I'd still be keen to learn about any practical attacks
on this in case there is something smart we can do to mitigate them
without cmpxchg(). For example, one silly approach might be to bound the
maximum increment and split larger ones up using a loop.

Will
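
(A hypothetical sketch of that last idea -- splitting a large increment
into bounded chunks so a single call can't leap deep into, or across, the
saturation window before the post-check fires; the bound and the function
names are made up for illustration, not code anyone actually proposed:)

#include <limits.h>
#include <stdatomic.h>

#define REFCOUNT_SATURATED	(INT_MIN / 2)
#define REFCOUNT_ADD_BOUND	1024		/* arbitrary bound */

/* The saturating post-check add from the earlier sketch. */
static void refcount_add_sketch(int i, atomic_int *r)
{
	int old = atomic_fetch_add_explicit(r, i, memory_order_relaxed);
	int new = (int)((unsigned int)old + (unsigned int)i);

	if (old < 0 || new < 0)
		atomic_store_explicit(r, REFCOUNT_SATURATED,
				      memory_order_relaxed);
}

/* Feed large increments in chunks; each chunk gets its own check. */
static void refcount_add_bounded(int i, atomic_int *r)
{
	while (i > REFCOUNT_ADD_BOUND) {
		refcount_add_sketch(REFCOUNT_ADD_BOUND, r);
		i -= REFCOUNT_ADD_BOUND;
	}
	if (i > 0)
		refcount_add_sketch(i, r);
}

int main(void)
{
	atomic_int r = 1;

	refcount_add_bounded(1 << 20, &r);	/* 1024 chunked steps */
	return 0;
}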

2020-03-02 13:07:25

by Jann Horn

[permalink] [raw]
Subject: Re: [RESEND PATCH v4 05/10] lib/refcount: Improve performance of generic REFCOUNT_FULL code

On Fri, Feb 28, 2020 at 11:43 AM Will Deacon <[email protected]> wrote:
> On Wed, Feb 26, 2020 at 05:10:46AM +0100, Jann Horn wrote:
> > On Thu, Nov 21, 2019 at 12:58 PM Will Deacon <[email protected]> wrote:
> > > + * If another thread also performs a refcount_inc() operation between the two
> > > + * atomic operations, then the count will continue to edge closer to 0. If it
> > > + * reaches a value of 1 before /any/ of the threads reset it to the saturated
> > > + * value, then a concurrent refcount_dec_and_test() may erroneously free the
> > > + * underlying object. Given the precise timing details involved with the
> > > + * round-robin scheduling of each thread manipulating the refcount and the need
> > > + * to hit the race multiple times in succession, there doesn't appear to be a
> > > + * practical avenue of attack even if using refcount_add() operations with
> > > + * larger increments.
> >
> > On top of that, the number of threads that can actually be running at
> > a given time is capped. See include/linux/threads.h, where it is
> > capped to pow(2, 22):
> >
> > /*
> > * A maximum of 4 million PIDs should be enough for a while.
> > * [NOTE: PID/TIDs are limited to 2^29 ~= 500+ million, see futex.h.]
> > */
> > #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
> > 	(sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
> >
> > And in the futex UAPI header, we have this, baking a TID limit into
> > the userspace API (note that this is pow(2,30), not pow(2,29) as the
> > comment in threads.h claims - I'm not sure where that difference comes
> > from):
> >
> > /*
> > * The rest of the robust-futex field is for the TID:
> > */
> > #define FUTEX_TID_MASK 0x3fffffff
> >
> > So AFAICS, with the current PID_MAX_LIMIT, if you assume that all
> > participating refcount operations are non-batched (delta 1) and the
> > attacker can't cause the threads to oops in the middle of the refcount
> > operation (maybe that would be possible if you managed to find
> > something like a NULL pointer dereference in perf software event code
> > and had perf paranoia at <= 1, or something like that? - I'm not
> > sure), then even in a theoretical scenario where an attacker spawns
> > the maximum number of tasks possible and manages to get all of them to
> > sequentially preempt while being in the middle of increment operations
> > in several nested contexts (I'm not sure whether that can even happen
> > - you're not going to take typical sleeping exceptions like page
> > faults in the middle of a refcount op), the attacker will stay
> > comfortably inside the saturated range. Even if the PID_MAX_LIMIT is
> > at some point raised to the maximum permitted by the futex UAPI, this
> > still holds as long as you assume no nesting. Hm, should I send a
> > patch to add something like this to the explanatory comment?
>
> Sure, I'd be happy to improve the document by adding this -- please send
> out a patch for review. It's probably also worth mentioning the batching
> use-cases, although I struggle to reason about the window between the
> {under,over}flow occurring and saturation.

In the overflow case, it's fine, right? If you briefly crossed into
the saturation range and then went back down, using some tasks with
half-completed refcounting operations, then the refcount is still
behaving as a correct non-saturating refcount. (And it can't cross
over to the other end of the saturation range, because that's twice as
much distance as you'd need to unpin a saturated refcount.)

And in the underflow case, we can't deterministically protect anyway
without some external mechanism to protect the object's lifetime while
someone is already freeing it - so that's pretty much just a
best-effort thing anyway.
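
(The arithmetic behind "twice as much distance", assuming the 0xc000_0000
saturation point -- a quick worked check:)

#include <stdio.h>

int main(void)
{
	/* "Bad" (signed-negative) range: 0x8000_0000 .. 0xffff_ffff,
	 * with REFCOUNT_SATURATED = 0xc000_0000 at its midpoint. */
	unsigned long long unpin = 0xc0000000ULL - 0x80000000ULL;
	unsigned long long cross = 0x100000000ULL - 0x80000000ULL;

	printf("steps to unpin a saturated count: 0x%llx\n", unpin);  /* 2^30 */
	printf("steps to cross the whole bad range: 0x%llx\n", cross); /* 2^31 */
	return 0;
}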

> > Of course, if someone uses refcount batching with sufficiently large
> > values, those guarantees go out of the window - if we wanted to be
> > perfectionist about this, we could make the batched operations do slow
> > cmpxchg stuff while letting the much more performance-critical
> > single-reference case continue to use the fast saturation scheme.
> > OTOH, the networking folks would probably hate that, since they're
> > using the batched ops for ->sk_wmem_alloc stuff, where they count
> > bytes as references? So I guess maybe we should leave it as-is.
>
> Agreed. If we hamper the performance here, then people will either look
> to disable the checking or they will switch to atomic_t, which puts us
> back to square one. Perfection is the enemy of the good and all that.
>
> Having said that, I'd still be keen to learn about any practical attacks
> on this in case there is something smart we can do to mitigate them
> without cmpxchg(). For example, one silly approach might be to bound the
> maximum increment and split larger ones up using a loop.

I guess that'd do the job. I don't know whether it's worth the trouble
in practice though.