2020-11-22 16:27:48

by Francis Laniel

[permalink] [raw]
Subject: [PATCH v7 0/5] Fortify strscpy()

From: Francis Laniel <[email protected]>

Hi.


I hope your families, friends and yourselves are fine.

This patch implements a fortified version of strscpy() enabled by setting
CONFIG_FORTIFY_SOURCE=y.
The new version ensures the following before calling vanilla strscpy():
1. There is no read overflow because either size is smaller than src length
or we shrink size to src length by calling fortified strnlen().
2. There is no write overflow because we either failed during compilation or at
runtime by checking that size is smaller than dest size.
Note that, if src and dst size cannot be got, the patch defaults to call vanilla
strscpy().

The patches adds the following:
1. Implement the fortified version of strscpy().
2. Add a new LKDTM test to ensures the fortified version still returns the same
value as the vanilla one while panic'ing when there is a write overflow.
3. Correct some typos in LKDTM related file.

I based my modifications on top of two patches from Daniel Axtens which modify
calls to __builtin_object_size, in fortified string functions, to ensure the
true size of char * are returned and not the surrounding structure size.

About performance, I measured the slow down of fortified strscpy(), using the
vanilla one as baseline.
The hardware I used is an Intel i3 2130 CPU clocked at 3.4 GHz.
I ran "Linux 5.10.0-rc4+ SMP PREEMPT" inside qemu 3.10 with 4 CPU cores.
The following code, called through LKDTM, was used as a benchmark:
#define TIMES 10000
char *src;
char dst[7];
int i;
ktime_t begin;

src = kstrdup("foobar", GFP_KERNEL);

if (src == NULL)
return;

begin = ktime_get();
for (i = 0; i < TIMES; i++)
strscpy(dst, src, strlen(src));
pr_info("%d fortified strscpy() tooks %lld", TIMES, ktime_get() - begin);

begin = ktime_get();
for (i = 0; i < TIMES; i++)
__real_strscpy(dst, src, strlen(src));
pr_info("%d vanilla strscpy() tooks %lld", TIMES, ktime_get() - begin);

kfree(src);
I called the above code 30 times to compute stats for each version (in ns, round
to int):
| version | mean | std | median | 95th |
| --------- | ------- | ------ | ------- | ------- |
| fortified | 245_069 | 54_657 | 216_230 | 331_122 |
| vanilla | 172_501 | 70_281 | 143_539 | 219_553 |
On average, fortified strscpy() is approximately 1.42 times slower than vanilla
strscpy().
For the 95th percentile, the fortified version is about 1.50 times slower.

So, clearly the stats are not in favor of fortified strscpy().
But, the fortified version loops the string twice (one in strnlen() and another
in vanilla strscpy()) while the vanilla one only loops once.
This can explain why fortified strscpy() is slower than the vanilla one.

Please, let me know your opinion about the patch and if you have any idea to
improve it.


Best regards.

Daniel Axtens (2):
string.h: detect intra-object overflow in fortified string functions
lkdtm: tests for FORTIFY_SOURCE

Francis Laniel (3):
string.h: Add FORTIFY coverage for strscpy()
Add new file in LKDTM to test fortified strscpy.
Correct wrong filenames in comment.

drivers/misc/lkdtm/Makefile | 1 +
drivers/misc/lkdtm/bugs.c | 50 +++++++++++++++
drivers/misc/lkdtm/core.c | 3 +
drivers/misc/lkdtm/fortify.c | 82 +++++++++++++++++++++++++
drivers/misc/lkdtm/lkdtm.h | 19 +++---
include/linux/string.h | 75 ++++++++++++++++++----
tools/testing/selftests/lkdtm/tests.txt | 1 +
7 files changed, 213 insertions(+), 18 deletions(-)
create mode 100644 drivers/misc/lkdtm/fortify.c

--
2.20.1


2020-11-22 16:27:53

by Francis Laniel

[permalink] [raw]
Subject: [PATCH v7 1/5] string.h: detect intra-object overflow in fortified string functions

From: Daniel Axtens <[email protected]>

When the fortify feature was first introduced in commit 6974f0c4555e
("include/linux/string.h: add the option of fortified string.h functions"),
Daniel Micay observed:

* It should be possible to optionally use __builtin_object_size(x, 1) for
some functions (C strings) to detect intra-object overflows (like
glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
approach to avoid likely compatibility issues.

This is a case that often cannot be caught by KASAN. Consider:

struct foo {
char a[10];
char b[10];
}

void test() {
char *msg;
struct foo foo;

msg = kmalloc(16, GFP_KERNEL);
strcpy(msg, "Hello world!!");
// this copy overwrites foo.b
strcpy(foo.a, msg);
}

The questionable copy overflows foo.a and writes to foo.b as well. It
cannot be detected by KASAN. Currently it is also not detected by fortify,
because strcpy considers __builtin_object_size(x, 0), which considers the
size of the surrounding object (here, struct foo). However, if we switch
the string functions over to use __builtin_object_size(x, 1), the compiler
will measure the size of the closest surrounding subobject (here, foo.a),
rather than the size of the surrounding object as a whole. See
https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html for more info.

Only do this for string functions: we cannot use it on things like
memcpy, memmove, memcmp and memchr_inv due to code like this which
purposefully operates on multiple structure members:
(arch/x86/kernel/traps.c)

/*
* regs->sp points to the failing IRET frame on the
* ESPFIX64 stack. Copy it to the entry stack. This fills
* in gpregs->ss through gpregs->ip.
*
*/
memmove(&gpregs->ip, (void *)regs->sp, 5*8);

This change passes an allyesconfig on powerpc and x86, and an x86 kernel
built with it survives running with syz-stress from syzkaller, so it seems
safe so far.

Cc: Daniel Micay <[email protected]>
Cc: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Francis Laniel <[email protected]>
---
include/linux/string.h | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index b1f3894a0a3e..46e91d684c47 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -292,7 +292,7 @@ extern char *__underlying_strncpy(char *p, const char *q, __kernel_size_t size)

__FORTIFY_INLINE char *strncpy(char *p, const char *q, __kernel_size_t size)
{
- size_t p_size = __builtin_object_size(p, 0);
+ size_t p_size = __builtin_object_size(p, 1);
if (__builtin_constant_p(size) && p_size < size)
__write_overflow();
if (p_size < size)
@@ -302,7 +302,7 @@ __FORTIFY_INLINE char *strncpy(char *p, const char *q, __kernel_size_t size)

__FORTIFY_INLINE char *strcat(char *p, const char *q)
{
- size_t p_size = __builtin_object_size(p, 0);
+ size_t p_size = __builtin_object_size(p, 1);
if (p_size == (size_t)-1)
return __underlying_strcat(p, q);
if (strlcat(p, q, p_size) >= p_size)
@@ -313,7 +313,7 @@ __FORTIFY_INLINE char *strcat(char *p, const char *q)
__FORTIFY_INLINE __kernel_size_t strlen(const char *p)
{
__kernel_size_t ret;
- size_t p_size = __builtin_object_size(p, 0);
+ size_t p_size = __builtin_object_size(p, 1);

/* Work around gcc excess stack consumption issue */
if (p_size == (size_t)-1 ||
@@ -328,7 +328,7 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p)
extern __kernel_size_t __real_strnlen(const char *, __kernel_size_t) __RENAME(strnlen);
__FORTIFY_INLINE __kernel_size_t strnlen(const char *p, __kernel_size_t maxlen)
{
- size_t p_size = __builtin_object_size(p, 0);
+ size_t p_size = __builtin_object_size(p, 1);
__kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : p_size);
if (p_size <= ret && maxlen != ret)
fortify_panic(__func__);
@@ -340,8 +340,8 @@ extern size_t __real_strlcpy(char *, const char *, size_t) __RENAME(strlcpy);
__FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size)
{
size_t ret;
- size_t p_size = __builtin_object_size(p, 0);
- size_t q_size = __builtin_object_size(q, 0);
+ size_t p_size = __builtin_object_size(p, 1);
+ size_t q_size = __builtin_object_size(q, 1);
if (p_size == (size_t)-1 && q_size == (size_t)-1)
return __real_strlcpy(p, q, size);
ret = strlen(q);
@@ -361,8 +361,8 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size)
__FORTIFY_INLINE char *strncat(char *p, const char *q, __kernel_size_t count)
{
size_t p_len, copy_len;
- size_t p_size = __builtin_object_size(p, 0);
- size_t q_size = __builtin_object_size(q, 0);
+ size_t p_size = __builtin_object_size(p, 1);
+ size_t q_size = __builtin_object_size(q, 1);
if (p_size == (size_t)-1 && q_size == (size_t)-1)
return __underlying_strncat(p, q, count);
p_len = strlen(p);
@@ -475,11 +475,16 @@ __FORTIFY_INLINE void *kmemdup(const void *p, size_t size, gfp_t gfp)
/* defined after fortified strlen and memcpy to reuse them */
__FORTIFY_INLINE char *strcpy(char *p, const char *q)
{
- size_t p_size = __builtin_object_size(p, 0);
- size_t q_size = __builtin_object_size(q, 0);
+ size_t p_size = __builtin_object_size(p, 1);
+ size_t q_size = __builtin_object_size(q, 1);
+ size_t size;
if (p_size == (size_t)-1 && q_size == (size_t)-1)
return __underlying_strcpy(p, q);
- memcpy(p, q, strlen(q) + 1);
+ size = strlen(q) + 1;
+ /* test here to use the more stringent object size */
+ if (p_size < size)
+ fortify_panic(__func__);
+ memcpy(p, q, size);
return p;
}

--
2.20.1

2020-11-22 16:27:58

by Francis Laniel

[permalink] [raw]
Subject: [PATCH v7 2/5] lkdtm: tests for FORTIFY_SOURCE

From: Daniel Axtens <[email protected]>

Add code to test both:

- runtime detection of the overrun of a structure. This covers the
__builtin_object_size(x, 0) case. This test is called FORTIFY_OBJECT.

- runtime detection of the overrun of a char array within a structure.
This covers the __builtin_object_size(x, 1) case which can be used
for some string functions. This test is called FORTIFY_SUBOBJECT.

Suggested-by: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Francis Laniel <[email protected]>
---
drivers/misc/lkdtm/bugs.c | 50 ++++++++++++++++++++++++++++++++++++++
drivers/misc/lkdtm/core.c | 2 ++
drivers/misc/lkdtm/lkdtm.h | 2 ++
3 files changed, 54 insertions(+)

diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index a0675d4154d2..110f5a8538e9 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -482,3 +482,53 @@ noinline void lkdtm_CORRUPT_PAC(void)
pr_err("XFAIL: this test is arm64-only\n");
#endif
}
+
+void lkdtm_FORTIFY_OBJECT(void)
+{
+ struct target {
+ char a[10];
+ } target[2] = {};
+ int result;
+
+ /*
+ * Using volatile prevents the compiler from determining the value of
+ * 'size' at compile time. Without that, we would get a compile error
+ * rather than a runtime error.
+ */
+ volatile int size = 11;
+
+ pr_info("trying to read past the end of a struct\n");
+
+ result = memcmp(&target[0], &target[1], size);
+
+ /* Print result to prevent the code from being eliminated */
+ pr_err("FAIL: fortify did not catch an object overread!\n"
+ "\"%d\" was the memcmp result.\n", result);
+}
+
+void lkdtm_FORTIFY_SUBOBJECT(void)
+{
+ struct target {
+ char a[10];
+ char b[10];
+ } target;
+ char *src;
+
+ src = kmalloc(20, GFP_KERNEL);
+ strscpy(src, "over ten bytes", 20);
+
+ pr_info("trying to strcpy past the end of a member of a struct\n");
+
+ /*
+ * strncpy(target.a, src, 20); will hit a compile error because the
+ * compiler knows at build time that target.a < 20 bytes. Use strcpy()
+ * to force a runtime error.
+ */
+ strcpy(target.a, src);
+
+ /* Use target.a to prevent the code from being eliminated */
+ pr_err("FAIL: fortify did not catch an sub-object overrun!\n"
+ "\"%s\" was copied.\n", target.a);
+
+ kfree(src);
+}
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index 97803f213d9d..b8c51a633fcc 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -117,6 +117,8 @@ static const struct crashtype crashtypes[] = {
CRASHTYPE(UNSET_SMEP),
CRASHTYPE(CORRUPT_PAC),
CRASHTYPE(UNALIGNED_LOAD_STORE_WRITE),
+ CRASHTYPE(FORTIFY_OBJECT),
+ CRASHTYPE(FORTIFY_SUBOBJECT),
CRASHTYPE(OVERWRITE_ALLOCATION),
CRASHTYPE(WRITE_AFTER_FREE),
CRASHTYPE(READ_AFTER_FREE),
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 6dec4c9b442f..49e6b945feb7 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -32,6 +32,8 @@ void lkdtm_STACK_GUARD_PAGE_TRAILING(void);
void lkdtm_UNSET_SMEP(void);
void lkdtm_DOUBLE_FAULT(void);
void lkdtm_CORRUPT_PAC(void);
+void lkdtm_FORTIFY_OBJECT(void);
+void lkdtm_FORTIFY_SUBOBJECT(void);

/* lkdtm_heap.c */
void __init lkdtm_heap_init(void);
--
2.20.1

2020-11-22 16:28:39

by Francis Laniel

[permalink] [raw]
Subject: [PATCH v7 5/5] Correct wrong filenames in comment.

From: Francis Laniel <[email protected]>

In lkdtm.h, files targeted in comments are named "lkdtm_file.c" while there are
named "file.c" in directory.

Acked-by: Kees Cook <[email protected]>
Signed-off-by: Francis Laniel <[email protected]>
---
drivers/misc/lkdtm/lkdtm.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 138f06254b61..6aa6d6a1a839 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -6,7 +6,7 @@

#include <linux/kernel.h>

-/* lkdtm_bugs.c */
+/* bugs.c */
void __init lkdtm_bugs_init(int *recur_param);
void lkdtm_PANIC(void);
void lkdtm_BUG(void);
@@ -35,7 +35,7 @@ void lkdtm_CORRUPT_PAC(void);
void lkdtm_FORTIFY_OBJECT(void);
void lkdtm_FORTIFY_SUBOBJECT(void);

-/* lkdtm_heap.c */
+/* heap.c */
void __init lkdtm_heap_init(void);
void __exit lkdtm_heap_exit(void);
void lkdtm_OVERWRITE_ALLOCATION(void);
@@ -47,7 +47,7 @@ void lkdtm_SLAB_FREE_DOUBLE(void);
void lkdtm_SLAB_FREE_CROSS(void);
void lkdtm_SLAB_FREE_PAGE(void);

-/* lkdtm_perms.c */
+/* perms.c */
void __init lkdtm_perms_init(void);
void lkdtm_WRITE_RO(void);
void lkdtm_WRITE_RO_AFTER_INIT(void);
@@ -62,7 +62,7 @@ void lkdtm_EXEC_NULL(void);
void lkdtm_ACCESS_USERSPACE(void);
void lkdtm_ACCESS_NULL(void);

-/* lkdtm_refcount.c */
+/* refcount.c */
void lkdtm_REFCOUNT_INC_OVERFLOW(void);
void lkdtm_REFCOUNT_ADD_OVERFLOW(void);
void lkdtm_REFCOUNT_INC_NOT_ZERO_OVERFLOW(void);
@@ -83,10 +83,10 @@ void lkdtm_REFCOUNT_SUB_AND_TEST_SATURATED(void);
void lkdtm_REFCOUNT_TIMING(void);
void lkdtm_ATOMIC_TIMING(void);

-/* lkdtm_rodata.c */
+/* rodata.c */
void lkdtm_rodata_do_nothing(void);

-/* lkdtm_usercopy.c */
+/* usercopy.c */
void __init lkdtm_usercopy_init(void);
void __exit lkdtm_usercopy_exit(void);
void lkdtm_USERCOPY_HEAP_SIZE_TO(void);
@@ -98,7 +98,7 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
void lkdtm_USERCOPY_STACK_BEYOND(void);
void lkdtm_USERCOPY_KERNEL(void);

-/* lkdtm_stackleak.c */
+/* stackleak.c */
void lkdtm_STACKLEAK_ERASING(void);

/* cfi.c */
--
2.20.1

2020-11-22 16:30:00

by Francis Laniel

[permalink] [raw]
Subject: [PATCH v7 3/5] string.h: Add FORTIFY coverage for strscpy()

From: Francis Laniel <[email protected]>

The fortified version of strscpy ensures the following before vanilla strscpy
is called:
1. There is no read overflow because we either size is smaller than src length
or we shrink size to src length by calling fortified strnlen.
2. There is no write overflow because we either failed during compilation or at
runtime by checking that size is smaller than dest size.

Acked-by: Kees Cook <[email protected]>
Signed-off-by: Francis Laniel <[email protected]>
---
include/linux/string.h | 48 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

diff --git a/include/linux/string.h b/include/linux/string.h
index 46e91d684c47..1cd63a8a23ab 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -6,6 +6,7 @@
#include <linux/compiler.h> /* for inline */
#include <linux/types.h> /* for size_t */
#include <linux/stddef.h> /* for NULL */
+#include <linux/errno.h> /* for E2BIG */
#include <stdarg.h>
#include <uapi/linux/string.h>

@@ -357,6 +358,53 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size)
return ret;
}

+/* defined after fortified strnlen to reuse it */
+extern ssize_t __real_strscpy(char *, const char *, size_t) __RENAME(strscpy);
+__FORTIFY_INLINE ssize_t strscpy(char *p, const char *q, size_t size)
+{
+ size_t len;
+ /* Use string size rather than possible enclosing struct size. */
+ size_t p_size = __builtin_object_size(p, 1);
+ size_t q_size = __builtin_object_size(q, 1);
+
+ /* If we cannot get size of p and q default to call strscpy. */
+ if (p_size == (size_t) -1 && q_size == (size_t) -1)
+ return __real_strscpy(p, q, size);
+
+ /*
+ * If size can be known at compile time and is greater than
+ * p_size, generate a compile time write overflow error.
+ */
+ if (__builtin_constant_p(size) && size > p_size)
+ __write_overflow();
+
+ /*
+ * This call protects from read overflow, because len will default to q
+ * length if it smaller than size.
+ */
+ len = strnlen(q, size);
+ /*
+ * If len equals size, we will copy only size bytes which leads to
+ * -E2BIG being returned.
+ * Otherwise we will copy len + 1 because of the final '\O'.
+ */
+ len = len == size ? size : len + 1;
+
+ /*
+ * Generate a runtime write overflow error if len is greater than
+ * p_size.
+ */
+ if (len > p_size)
+ fortify_panic(__func__);
+
+ /*
+ * We can now safely call vanilla strscpy because we are protected from:
+ * 1. Read overflow thanks to call to strnlen().
+ * 2. Write overflow thanks to above ifs.
+ */
+ return __real_strscpy(p, q, len);
+}
+
/* defined after fortified strlen and strnlen to reuse them */
__FORTIFY_INLINE char *strncat(char *p, const char *q, __kernel_size_t count)
{
--
2.20.1