LinuxLists.cc - [PATCH v2 2/6] ubsan: Reintroduce signed and unsigned overflow sanitizers

2024-02-02 10:16:53

Subject: [PATCH v2 2/6] ubsan: Reintroduce signed and unsigned overflow sanitizers

Effectively revert commit 6aaa31aeb9cf ("ubsan: remove overflow
checks"), to allow the kernel to be built with the "overflow"
sanitizers again. This gives developers a chance to experiment[1][2][3]
with the instrumentation again, while compilers adjust their sanitizers
to deal with the impact of -fno-strict-oveflow (i.e. moving from
"overflow" checking to "wrap-around" checking).

Notably, the naming of the options is adjusted to use the name "WRAP"
instead of "OVERFLOW". In the strictest sense, arithmetic "overflow"
happens when a result exceeds the storage of the type, and is considered
by the C standard and compilers to be undefined behavior for signed
and pointer types (without -fno-strict-overflow). Unsigned arithmetic
overflow is defined as always wrapping around.

Because the kernel is built with -fno-strict-overflow, signed and pointer
arithmetic is defined to always wrap around instead of "overflowing"
(which could either be elided due to being undefined behavior or would
wrap around, which led to very weird bugs in the kernel).

So, the config options are added back as CONFIG_UBSAN_SIGNED_WRAP and
CONFIG_UBSAN_UNSIGNED_WRAP. Since the kernel has several places that
explicitly depend on wrap-around behavior (e.g. counters, atomics, crypto,
etc), also introduce the __signed_wrap and __unsigned_wrap function
attributes for annotating functions where wrapping is expected and should
not be instrumented. This will allow us to distinguish in the kernel
between intentional and unintentional cases of arithmetic wrap-around.

Additionally keep these disabled under CONFIG_COMPILE_TEST for now.

Link: https://github.com/KSPP/linux/issues/26 [1]
Link: https://github.com/KSPP/linux/issues/27 [2]
Link: https://github.com/KSPP/linux/issues/344 [3]
Cc: Justin Stitt <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: Przemek Kitszel <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
---
include/linux/compiler_types.h | 14 ++++++-
lib/Kconfig.ubsan | 19 ++++++++++
lib/test_ubsan.c | 49 ++++++++++++++++++++++++
lib/ubsan.c | 68 ++++++++++++++++++++++++++++++++++
lib/ubsan.h | 4 ++
scripts/Makefile.ubsan | 2 +
6 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 6f1ca49306d2..e585614f3152 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -282,11 +282,23 @@ struct ftrace_likely_data {
#define __no_sanitize_or_inline __always_inline
#endif

+/* Allow wrapping arithmetic within an annotated function. */
+#ifdef CONFIG_UBSAN_SIGNED_WRAP
+# define __signed_wrap __attribute__((no_sanitize("signed-integer-overflow")))
+#else
+# define __signed_wrap
+#endif
+#ifdef CONFIG_UBSAN_UNSIGNED_WRAP
+# define __unsigned_wrap __attribute__((no_sanitize("unsigned-integer-overflow")))
+#else
+# define __unsigned_wrap
+#endif
+
/* Section for code which can't be instrumented at all */
#define __noinstr_section(section) \
noinline notrace __attribute((__section__(section))) \
__no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage \
- __no_sanitize_memory
+ __no_sanitize_memory __signed_wrap __unsigned_wrap

#define noinstr __noinstr_section(".noinstr.text")

diff --git a/lib/Kconfig.ubsan b/lib/Kconfig.ubsan
index 59e21bfec188..a7003e5bd2a1 100644
--- a/lib/Kconfig.ubsan
+++ b/lib/Kconfig.ubsan
@@ -116,6 +116,25 @@ config UBSAN_UNREACHABLE
This option enables -fsanitize=unreachable which checks for control
flow reaching an expected-to-be-unreachable position.

+config UBSAN_SIGNED_WRAP
+ bool "Perform checking for signed arithmetic wrap-around"
+ default UBSAN
+ depends on !COMPILE_TEST
+ depends on $(cc-option,-fsanitize=signed-integer-overflow)
+ help
+ This option enables -fsanitize=signed-integer-overflow which checks
+ for wrap-around of any arithmetic operations with signed integers.
+
+config UBSAN_UNSIGNED_WRAP
+ bool "Perform checking for unsigned arithmetic wrap-around"
+ depends on $(cc-option,-fsanitize=unsigned-integer-overflow)
+ depends on !X86_32 # avoid excessive stack usage on x86-32/clang
+ depends on !COMPILE_TEST
+ help
+ This option enables -fsanitize=unsigned-integer-overflow which checks
+ for wrap-around of any arithmetic operations with unsigned integers. This
+ currently causes x86 to fail to boot.
+
config UBSAN_BOOL
bool "Perform checking for non-boolean values used as boolean"
default UBSAN
diff --git a/lib/test_ubsan.c b/lib/test_ubsan.c
index 2062be1f2e80..84d8092d6c32 100644
--- a/lib/test_ubsan.c
+++ b/lib/test_ubsan.c
@@ -11,6 +11,51 @@ typedef void(*test_ubsan_fp)(void);
#config, IS_ENABLED(config) ? "y" : "n"); \
} while (0)

+static void test_ubsan_add_overflow(void)
+{
+ volatile int val = INT_MAX;
+ volatile unsigned int uval = UINT_MAX;
+
+ UBSAN_TEST(CONFIG_UBSAN_SIGNED_WRAP);
+ val += 2;
+
+ UBSAN_TEST(CONFIG_UBSAN_UNSIGNED_WRAP);
+ uval += 2;
+}
+
+static void test_ubsan_sub_overflow(void)
+{
+ volatile int val = INT_MIN;
+ volatile unsigned int uval = 0;
+ volatile int val2 = 2;
+
+ UBSAN_TEST(CONFIG_UBSAN_SIGNED_WRAP);
+ val -= val2;
+
+ UBSAN_TEST(CONFIG_UBSAN_UNSIGNED_WRAP);
+ uval -= val2;
+}
+
+static void test_ubsan_mul_overflow(void)
+{
+ volatile int val = INT_MAX / 2;
+ volatile unsigned int uval = UINT_MAX / 2;
+
+ UBSAN_TEST(CONFIG_UBSAN_SIGNED_WRAP);
+ val *= 3;
+
+ UBSAN_TEST(CONFIG_UBSAN_UNSIGNED_WRAP);
+ uval *= 3;
+}
+
+static void test_ubsan_negate_overflow(void)
+{
+ volatile int val = INT_MIN;
+
+ UBSAN_TEST(CONFIG_UBSAN_SIGNED_WRAP);
+ val = -val;
+}
+
static void test_ubsan_divrem_overflow(void)
{
volatile int val = 16;
@@ -90,6 +135,10 @@ static void test_ubsan_misaligned_access(void)
}

static const test_ubsan_fp test_ubsan_array[] = {
+ test_ubsan_add_overflow,
+ test_ubsan_sub_overflow,
+ test_ubsan_mul_overflow,
+ test_ubsan_negate_overflow,
test_ubsan_shift_out_of_bounds,
test_ubsan_out_of_bounds,
test_ubsan_load_invalid_value,
diff --git a/lib/ubsan.c b/lib/ubsan.c
index df4f8d1354bb..5fc107f61934 100644
--- a/lib/ubsan.c
+++ b/lib/ubsan.c
@@ -222,6 +222,74 @@ static void ubsan_epilogue(void)
check_panic_on_warn("UBSAN");
}

+static void handle_overflow(struct overflow_data *data, void *lhs,
+ void *rhs, char op)
+{
+
+ struct type_descriptor *type = data->type;
+ char lhs_val_str[VALUE_LENGTH];
+ char rhs_val_str[VALUE_LENGTH];
+
+ if (suppress_report(&data->location))
+ return;
+
+ ubsan_prologue(&data->location, type_is_signed(type) ?
+ "signed-integer-overflow" :
+ "unsigned-integer-overflow");
+
+ val_to_string(lhs_val_str, sizeof(lhs_val_str), type, lhs);
+ val_to_string(rhs_val_str, sizeof(rhs_val_str), type, rhs);
+ pr_err("%s %c %s cannot be represented in type %s\n",
+ lhs_val_str,
+ op,
+ rhs_val_str,
+ type->type_name);
+
+ ubsan_epilogue();
+}
+
+void __ubsan_handle_add_overflow(void *data,
+ void *lhs, void *rhs)
+{
+
+ handle_overflow(data, lhs, rhs, '+');
+}
+EXPORT_SYMBOL(__ubsan_handle_add_overflow);
+
+void __ubsan_handle_sub_overflow(void *data,
+ void *lhs, void *rhs)
+{
+ handle_overflow(data, lhs, rhs, '-');
+}
+EXPORT_SYMBOL(__ubsan_handle_sub_overflow);
+
+void __ubsan_handle_mul_overflow(void *data,
+ void *lhs, void *rhs)
+{
+ handle_overflow(data, lhs, rhs, '*');
+}
+EXPORT_SYMBOL(__ubsan_handle_mul_overflow);
+
+void __ubsan_handle_negate_overflow(void *_data, void *old_val)
+{
+ struct overflow_data *data = _data;
+ char old_val_str[VALUE_LENGTH];
+
+ if (suppress_report(&data->location))
+ return;
+
+ ubsan_prologue(&data->location, "negation-overflow");
+
+ val_to_string(old_val_str, sizeof(old_val_str), data->type, old_val);
+
+ pr_err("negation of %s cannot be represented in type %s:\n",
+ old_val_str, data->type->type_name);
+
+ ubsan_epilogue();
+}
+EXPORT_SYMBOL(__ubsan_handle_negate_overflow);
+
+
void __ubsan_handle_divrem_overflow(void *_data, void *lhs, void *rhs)
{
struct overflow_data *data = _data;
diff --git a/lib/ubsan.h b/lib/ubsan.h
index 5d99ab81913b..0abbbac8700d 100644
--- a/lib/ubsan.h
+++ b/lib/ubsan.h
@@ -124,6 +124,10 @@ typedef s64 s_max;
typedef u64 u_max;
#endif

+void __ubsan_handle_add_overflow(void *data, void *lhs, void *rhs);
+void __ubsan_handle_sub_overflow(void *data, void *lhs, void *rhs);
+void __ubsan_handle_mul_overflow(void *data, void *lhs, void *rhs);
+void __ubsan_handle_negate_overflow(void *_data, void *old_val);
void __ubsan_handle_divrem_overflow(void *_data, void *lhs, void *rhs);
void __ubsan_handle_type_mismatch(struct type_mismatch_data *data, void *ptr);
void __ubsan_handle_type_mismatch_v1(void *_data, void *ptr);
diff --git a/scripts/Makefile.ubsan b/scripts/Makefile.ubsan
index 7cf42231042b..7b2f3d554c59 100644
--- a/scripts/Makefile.ubsan
+++ b/scripts/Makefile.ubsan
@@ -8,6 +8,8 @@ ubsan-cflags-$(CONFIG_UBSAN_LOCAL_BOUNDS) += -fsanitize=local-bounds
ubsan-cflags-$(CONFIG_UBSAN_SHIFT) += -fsanitize=shift
ubsan-cflags-$(CONFIG_UBSAN_DIV_ZERO) += -fsanitize=integer-divide-by-zero
ubsan-cflags-$(CONFIG_UBSAN_UNREACHABLE) += -fsanitize=unreachable
+ubsan-cflags-$(CONFIG_UBSAN_SIGNED_WRAP) += -fsanitize=signed-integer-overflow
+ubsan-cflags-$(CONFIG_UBSAN_UNSIGNED_WRAP) += -fsanitize=unsigned-integer-overflow
ubsan-cflags-$(CONFIG_UBSAN_BOOL) += -fsanitize=bool
ubsan-cflags-$(CONFIG_UBSAN_ENUM) += -fsanitize=enum
ubsan-cflags-$(CONFIG_UBSAN_TRAP) += $(call cc-option,-fsanitize-trap=undefined,-fsanitize-undefined-trap-on-error)
--
2.34.1

2024-02-02 12:17:19

by Kees Cook

[permalink] [raw]

Subject: Re: [PATCH v2 2/6] ubsan: Reintroduce signed and unsigned overflow sanitizers

On Fri, Feb 02, 2024 at 12:01:55PM +0100, Marco Elver wrote:
> On Fri, 2 Feb 2024 at 11:16, Kees Cook <[email protected]> wrote:
> > [...]
> > +config UBSAN_UNSIGNED_WRAP
> > + bool "Perform checking for unsigned arithmetic wrap-around"
> > + depends on $(cc-option,-fsanitize=unsigned-integer-overflow)
> > + depends on !X86_32 # avoid excessive stack usage on x86-32/clang
> > + depends on !COMPILE_TEST
> > + help
> > + This option enables -fsanitize=unsigned-integer-overflow which checks
> > + for wrap-around of any arithmetic operations with unsigned integers. This
> > + currently causes x86 to fail to boot.
>
> My hypothesis is that these options will quickly be enabled by various
> test and fuzzing setups, to the detriment of kernel developers. While
> the commit message states that these are for experimentation, I do not
> think it is at all clear from the Kconfig options.

I can certainly rephrase it more strongly. I would hope that anyone
enabling the unsigned sanitizer would quickly realize how extremely
noisy it is.

> Unsigned integer wrap-around is relatively common (it is _not_ UB
> after all). While I can appreciate that in some cases wrap around is a
> genuine semantic bug, and that's what we want to find with these
> changes, ultimately marking all semantically valid wrap arounds to
> catch the unmarked ones. Given these patterns are so common, and C
> programmers are used to them, it will take a lot of effort to mark all
> the intentional cases. But I fear that even if we get to that place,
> _unmarked_ but semantically valid unsigned wrap around will keep
> popping up again and again.

I agree -- it's going to be quite a challenge. My short-term goal is to
see how far the sanitizer itself can get with identifying intentional
uses. For example, I found two more extremely common code patterns that
trip it now:

unsigned int i = ...;
...
while (i--) { ... }

This trips the sanitizer at loop exit. :P It seems like churn to
refactor all of these into "for (; i; i--)". The compiler should be able
to identify this by looking for later uses of "i", etc.

The other is negative constants: -1UL, -3ULL, etc. These are all over
the place and very very obviously intentional and should be ignored by
the compiler.

> What is the long-term vision to minimize the additional churn this may
> introduce?

My hope is that we can evolve the coverage over time. Solving it all at
once won't be possible, but I think we can get pretty far with the
signed overflow sanitizer, which runs relatively cleanly already.

If we can't make meaningful progress in unsigned annotations, I think
we'll have to work on gaining type-based operator overloading so we can
grow type-aware arithmetic. That will serve as a much cleaner
annotation. E.g. introduce jiffie_t, which wraps.

> I think the problem reminds me a little of the data race problem,
> although I suspect unsigned integer wraparound is much more common
> than data races (which unlike unsigned wrap around is actually UB) -
> so chasing all intentional unsigned integer wrap arounds and marking
> will take even more effort than marking all intentional data races
> (which we're still slowly, but steadily, making progress towards).
>
> At the very least, these options should 'depends on EXPERT' or even
> 'depends on BROKEN' while the story is still being worked out.

Perhaps I should hold off on bringing the unsigned sanitizer back? I was
hoping to work in parallel with the signed sanitizer, but maybe this
isn't the right approach?

--
Kees Cook

2024-02-02 13:45:17

by Marco Elver

[permalink] [raw]

Subject: Re: [PATCH v2 2/6] ubsan: Reintroduce signed and unsigned overflow sanitizers

On Fri, 2 Feb 2024 at 13:17, Kees Cook <[email protected]> wrote:
>
> On Fri, Feb 02, 2024 at 12:01:55PM +0100, Marco Elver wrote:
> > On Fri, 2 Feb 2024 at 11:16, Kees Cook <[email protected]> wrote:
> > > [...]
> > > +config UBSAN_UNSIGNED_WRAP
> > > + bool "Perform checking for unsigned arithmetic wrap-around"
> > > + depends on $(cc-option,-fsanitize=unsigned-integer-overflow)
> > > + depends on !X86_32 # avoid excessive stack usage on x86-32/clang
> > > + depends on !COMPILE_TEST
> > > + help
> > > + This option enables -fsanitize=unsigned-integer-overflow which checks
> > > + for wrap-around of any arithmetic operations with unsigned integers. This
> > > + currently causes x86 to fail to boot.
> >
> > My hypothesis is that these options will quickly be enabled by various
> > test and fuzzing setups, to the detriment of kernel developers. While
> > the commit message states that these are for experimentation, I do not
> > think it is at all clear from the Kconfig options.
>
> I can certainly rephrase it more strongly. I would hope that anyone
> enabling the unsigned sanitizer would quickly realize how extremely
> noisy it is.
>
> > Unsigned integer wrap-around is relatively common (it is _not_ UB
> > after all). While I can appreciate that in some cases wrap around is a
> > genuine semantic bug, and that's what we want to find with these
> > changes, ultimately marking all semantically valid wrap arounds to
> > catch the unmarked ones. Given these patterns are so common, and C
> > programmers are used to them, it will take a lot of effort to mark all
> > the intentional cases. But I fear that even if we get to that place,
> > _unmarked_ but semantically valid unsigned wrap around will keep
> > popping up again and again.
>
> I agree -- it's going to be quite a challenge. My short-term goal is to
> see how far the sanitizer itself can get with identifying intentional
> uses. For example, I found two more extremely common code patterns that
> trip it now:
>
> unsigned int i = ...;
> ...
> while (i--) { ... }
>
> This trips the sanitizer at loop exit. :P It seems like churn to
> refactor all of these into "for (; i; i--)". The compiler should be able
> to identify this by looking for later uses of "i", etc.
>
> The other is negative constants: -1UL, -3ULL, etc. These are all over
> the place and very very obviously intentional and should be ignored by
> the compiler.

Yeah, banning technically valid code like this is going to be a very hard sell.

> > What is the long-term vision to minimize the additional churn this may
> > introduce?
>
> My hope is that we can evolve the coverage over time. Solving it all at
> once won't be possible, but I think we can get pretty far with the
> signed overflow sanitizer, which runs relatively cleanly already.
>
> If we can't make meaningful progress in unsigned annotations, I think
> we'll have to work on gaining type-based operator overloading so we can
> grow type-aware arithmetic. That will serve as a much cleaner
> annotation. E.g. introduce jiffie_t, which wraps.
>
> > I think the problem reminds me a little of the data race problem,
> > although I suspect unsigned integer wraparound is much more common
> > than data races (which unlike unsigned wrap around is actually UB) -
> > so chasing all intentional unsigned integer wrap arounds and marking
> > will take even more effort than marking all intentional data races
> > (which we're still slowly, but steadily, making progress towards).
> >
> > At the very least, these options should 'depends on EXPERT' or even
> > 'depends on BROKEN' while the story is still being worked out.
>
> Perhaps I should hold off on bringing the unsigned sanitizer back? I was
> hoping to work in parallel with the signed sanitizer, but maybe this
> isn't the right approach?

I leave that to you - to me any of these options would be ok:

1. Remove completely for now.

2. Make it 'depends on BROKEN' (because I think even 'depends on
EXPERT' won't help avoid the inevitable spam from test robots).

3. Make it a purely opt-in sanitizer: rather than having subsystems
opt out with UBSAN_WRAP_UNSIGNED:=n, do the opposite and say that for
subsystems that want to opt in, they have to specify
UBSAN_WRAP_UNSIGNED:=y to explicitly opt in.

I can see there being value in explicitly marking semantically
intended unsigned integer wrap, and catch unintended cases, so option
#3 seems appealing. At least that way, if a maintainer chooses to opt
in, they are committed to sorting out their code. Hypothetically, if I
was the maintainer of some smaller subsystem and have had wrap around
bugs in the past, I would certainly consider opting in. It feels a lot
nicer than having it forced upon me.

Thanks,
-- Marco

2024-02-02 16:18:56

by Miguel Ojeda

[permalink] [raw]

Subject: Re: [PATCH v2 2/6] ubsan: Reintroduce signed and unsigned overflow sanitizers

On Fri, Feb 2, 2024 at 1:17 PM Kees Cook <[email protected]> wrote:
>
> Perhaps I should hold off on bringing the unsigned sanitizer back? I was
> hoping to work in parallel with the signed sanitizer, but maybe this
> isn't the right approach?

If you can do anything to keep it in-tree, I think it would be nice so
that others can easily use it to test the tooling and to start to
clean up cases. A per-subsystem opt-in like Marco says could be a way,
and you could perhaps do one very small subsystem or similar to see
how it would look like.

Something that could also help would be to split the cases even
further (say, only overflows and not underflows), but is that a
possibility with the current tooling?

Thanks for working on this, Kees!

Cheers,
Miguel