LinuxLists.cc - [PATCH] lib/strscpy: avoid KASAN false positive

2017-07-18 17:13:29

Subject: [PATCH] lib/strscpy: avoid KASAN false positive

strscpy() performs the word-at-a-time optimistic reads. So it may
may access the memory past the end of the object, which is perfectly fine
since strscpy() doesn't use that (past-the-end) data and makes sure the
optimistic read won't cross a page boundary.

But KASAN doesn't know anything about that so it will complain.
Let's just fallback to the byte-at-a-time reads under CONFIG_KASAN=y
to avoid false-positives.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
---
lib/string.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index ebbb99c775bd..8b93d2519d5a 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -199,6 +199,13 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
max = 0;
#endif

+ /*
+ * KASAN won't be happy about word-at-a-time
+ * optimistic reads, so let's avoid them.
+ */
+ if (IS_ENABLED(CONFIG_KASAN))
+ max = 0;
+
while (max >= sizeof(unsigned long)) {
unsigned long c, data;

--
2.13.0

2017-07-18 17:14:50

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Tue, Jul 18, 2017 at 7:15 PM, Andrey Ryabinin
<[email protected]> wrote:
> strscpy() performs the word-at-a-time optimistic reads. So it may
> may access the memory past the end of the object, which is perfectly fine
> since strscpy() doesn't use that (past-the-end) data and makes sure the
> optimistic read won't cross a page boundary.
>
> But KASAN doesn't know anything about that so it will complain.
> Let's just fallback to the byte-at-a-time reads under CONFIG_KASAN=y
> to avoid false-positives.

Acked-by: Dmitry Vyukov <[email protected]>

> Reported-by: Dave Jones <[email protected]>
> Signed-off-by: Andrey Ryabinin <[email protected]>
> ---
> lib/string.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index ebbb99c775bd..8b93d2519d5a 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -199,6 +199,13 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
> max = 0;
> #endif
>
> + /*
> + * KASAN won't be happy about word-at-a-time
> + * optimistic reads, so let's avoid them.
> + */
> + if (IS_ENABLED(CONFIG_KASAN))
> + max = 0;
> +
> while (max >= sizeof(unsigned long)) {
> unsigned long c, data;
>
> --
> 2.13.0
>

2017-07-18 17:22:47

by Linus Torvalds

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Tue, Jul 18, 2017 at 10:15 AM, Andrey Ryabinin
<[email protected]> wrote:
>
> + /*
> + * KASAN won't be happy about word-at-a-time
> + * optimistic reads, so let's avoid them.
> + */
> + if (IS_ENABLED(CONFIG_KASAN))
> + max = 0;
> +

No, please don't.

Two reasons:

(a) it turns out that KASAN doesn't actually warn about this when
there aren't buggy users (because we only do word-at-a-time in the
spacified-to-be-safe region anyway).

(b) havign automated testing that then just changes semantics and
implementation of what is tested is a bad bad bad idea.

So (a) says that we shouldn't need it in the first place, and (b) says
that we should avoid KASAN changing behavior unless we absolutely
*have* to.

In fact, I think we should *never* have that kind of "KASAN changes
semantics". If there is some particular load that is known to be
problematic for KASAN, we *still* shouldn't change semantics, we
should just mark that single load as being unchecked by KASAN.

Linus

2017-07-18 20:16:01

by Andrey Ryabinin

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On 07/18/2017 08:22 PM, Linus Torvalds wrote:
> On Tue, Jul 18, 2017 at 10:15 AM, Andrey Ryabinin
> <[email protected]> wrote:
>>
>> + /*
>> + * KASAN won't be happy about word-at-a-time
>> + * optimistic reads, so let's avoid them.
>> + */
>> + if (IS_ENABLED(CONFIG_KASAN))
>> + max = 0;
>> +
>
> No, please don't.
>
> Two reasons:
>
> (a) it turns out that KASAN doesn't actually warn about this when
> there aren't buggy users (because we only do word-at-a-time in the
> spacified-to-be-safe region anyway).
>

No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
So KASAN will warn for perfectly valid code like this:
char dest[16];
strscpy(dest, "12345", sizeof(dest)):

Currently KASAN doesn't complain about strscpy() only because KASAN doesn't complain about unused code.
And after 077d2ba519b2e8bf1ab("replace incorrect strscpy use in FORTIFY_SOURCE") or before the
FORTIFY_SOURCE patch strscpy() is pretty much unused. Only 2 calls in some drivers, plus 3 in tile arch-code.

> (b) havign automated testing that then just changes semantics and
> implementation of what is tested is a bad bad bad idea.
>

I agree, but what choice do we have here?

> So (a) says that we shouldn't need it in the first place, and (b) says
> that we should avoid KASAN changing behavior unless we absolutely
> *have* to.
>

(a) is wrong. I absolutely agree with (b) and I think that this is exactly
the case where we have to do this.

> In fact, I think we should *never* have that kind of "KASAN changes
> semantics". If there is some particular load that is known to be
> problematic for KASAN, we *still* shouldn't change semantics, we
> should just mark that single load as being unchecked by KASAN.

For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.

Although, we can always check the 'src' afterwards. But honestly, this looks shitty:

---
lib/string.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index ebbb99c775bd..5624b629bffa 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -23,6 +23,7 @@
#include <linux/string.h>
#include <linux/ctype.h>
#include <linux/kernel.h>
+#include <linux/kasan-checks.h>
#include <linux/export.h>
#include <linux/bug.h>
#include <linux/errno.h>
@@ -202,12 +203,15 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
while (max >= sizeof(unsigned long)) {
unsigned long c, data;

- c = *(unsigned long *)(src+res);
+ c = READ_ONCE_NOCHECK(*(unsigned long *)(src+res));
if (has_zero(c, &data, &constants)) {
data = prep_zero_mask(c, data, &constants);
data = create_zero_mask(data);
*(unsigned long *)(dest+res) = c & zero_bytemask(data);
- return res + find_zero(data);
+
+ res = res + find_zero(data);
+ kasan_check_read(src, res);
+ return res;
}
*(unsigned long *)(dest+res) = c;
res += sizeof(unsigned long);
@@ -215,6 +219,7 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
max -= sizeof(unsigned long);
}

+ kasan_check_read(src, res);
while (count) {
char c;

--
2.13.0

2017-07-18 20:26:20

by Linus Torvalds

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Tue, Jul 18, 2017 at 1:15 PM, Andrey Ryabinin
<[email protected]> wrote:
>
> No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
> it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
> So KASAN will warn for perfectly valid code like this:
> char dest[16];
> strscpy(dest, "12345", sizeof(dest)):

Ugh, ok, yes.

> For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.

So we do have that READ_ONCE_NOCHECK(), but could we perhaps have
something that doesn't do a NOCHECK but a partial check and is simply
ok with "this is an optimistc longer access"

We have that for the dcache case too, although there the code does
that odd kasan_unpoison_shadow() instead.

Linus

2017-07-18 21:32:04

by Andrey Ryabinin

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On 07/18/2017 11:26 PM, Linus Torvalds wrote:
> On Tue, Jul 18, 2017 at 1:15 PM, Andrey Ryabinin
> <[email protected]> wrote:
>>
>> No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
>> it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
>> So KASAN will warn for perfectly valid code like this:
>> char dest[16];
>> strscpy(dest, "12345", sizeof(dest)):
>
> Ugh, ok, yes.
>
>> For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.
>
> So we do have that READ_ONCE_NOCHECK(), but could we perhaps have
> something that doesn't do a NOCHECK but a partial check and is simply
> ok with "this is an optimistc longer access"
>

This can be dont, I think.

Something like this:
static inline unsigned long read_partial_nocheck(unsigned long *x)
{
unsigned long ret = READ_ONCE_NOCHECK(x);
kasan_check_partial(x, sizeof(unsigned long));
return ret;
}

Where kasan_check_partial() will warn only if the kasan shadow has a negative value.
Positive values 1-7 in the shadow would mean a partial oob access, that should be ignored.

> We have that for the dcache case too, although there the code does
> that odd kasan_unpoison_shadow() instead.
>

Yes, it marks memory as valid, so it can be accessed without warning.
We can do this the same in strscpy(), but the disadvantage of this approach is that
memory becomes accessible for everyone, not just for the code that does optimistic reads.
Thus it would be possible to miss some off-by-ones (quite usual bug for strings).

2017-07-18 21:32:56

by Andrey Ryabinin

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On 07/19/2017 12:31 AM, Andrey Ryabinin wrote:
> On 07/18/2017 11:26 PM, Linus Torvalds wrote:
>> On Tue, Jul 18, 2017 at 1:15 PM, Andrey Ryabinin
>> <[email protected]> wrote:
>>>
>>> No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
>>> it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
>>> So KASAN will warn for perfectly valid code like this:
>>> char dest[16];
>>> strscpy(dest, "12345", sizeof(dest)):
>>
>> Ugh, ok, yes.
>>
>>> For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.
>>
>> So we do have that READ_ONCE_NOCHECK(), but could we perhaps have
>> something that doesn't do a NOCHECK but a partial check and is simply
>> ok with "this is an optimistc longer access"
>>
>
> This can be dont, I think.
s/dont/done

2017-07-18 22:04:44

by Andrew Morton

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Wed, 19 Jul 2017 00:31:36 +0300 Andrey Ryabinin <[email protected]> wrote:

> On 07/18/2017 11:26 PM, Linus Torvalds wrote:
> > On Tue, Jul 18, 2017 at 1:15 PM, Andrey Ryabinin
> > <[email protected]> wrote:
> >>
> >> No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
> >> it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
> >> So KASAN will warn for perfectly valid code like this:
> >> char dest[16];
> >> strscpy(dest, "12345", sizeof(dest)):
> >
> > Ugh, ok, yes.
> >
> >> For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.
> >
> > So we do have that READ_ONCE_NOCHECK(), but could we perhaps have
> > something that doesn't do a NOCHECK but a partial check and is simply
> > ok with "this is an optimistc longer access"
> >
>
> This can be dont, I think.
>
> Something like this:
> static inline unsigned long read_partial_nocheck(unsigned long *x)
> {
> unsigned long ret = READ_ONCE_NOCHECK(x);
> kasan_check_partial(x, sizeof(unsigned long));
> return ret;
> }
>

(Cc Chris)

We could just remove all that word-at-a-time logic. Do we have any
evidence that this would harm anything?

2017-07-18 22:35:24

by Linus Torvalds

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Tue, Jul 18, 2017 at 3:04 PM, Andrew Morton
<[email protected]> wrote:
>
> We could just remove all that word-at-a-time logic. Do we have any
> evidence that this would harm anything?

One option would be to simply make reads more permissive, and only
check the first byte of the read (unlike the last, which is what it
does now). Obviously within the normal KASAN granularity level (which
I think is 8 byte aligned anyway).

Checking *writes* more carefully is obviously a good idea regardless.
But the whole "do big reads" is pretty common for a lot of memory and
string copies.

So maybe the whole KASAN_SHADOW_MASK thing could be limited to just
the write side?

That would certainly simplify things. How much it would hurt coverage
would be something that the people who have analyzed KASAN reports so
far would have to judge. Have any of the existing KASAN reports been
due to the KASAN_SHADOW_SCALE level read tests?

Linus

2017-07-19 07:46:37

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Wed, Jul 19, 2017 at 12:35 AM, Linus Torvalds
<[email protected]> wrote:
> On Tue, Jul 18, 2017 at 3:04 PM, Andrew Morton
> <[email protected]> wrote:
>>
>> We could just remove all that word-at-a-time logic. Do we have any
>> evidence that this would harm anything?
>
> One option would be to simply make reads more permissive, and only
> check the first byte of the read (unlike the last, which is what it
> does now). Obviously within the normal KASAN granularity level (which
> I think is 8 byte aligned anyway).
>
> Checking *writes* more carefully is obviously a good idea regardless.
> But the whole "do big reads" is pretty common for a lot of memory and
> string copies.
>
> So maybe the whole KASAN_SHADOW_MASK thing could be limited to just
> the write side?
>
> That would certainly simplify things. How much it would hurt coverage
> would be something that the people who have analyzed KASAN reports so
> far would have to judge. Have any of the existing KASAN reports been
> due to the KASAN_SHADOW_SCALE level read tests?

We've seen a bunch of such out-of-bounds KASAN reports -- on strings,
network packets, other binary data. I would be very much against
relaxing the general C rules for the two cases of formally incorrect C
code.

I see 3 ways out:

1. What Andrew mentioned -- don't do out of bounds reads. Solves the
problem with perfectly clean code.
I guess now people may ask for benchmarks for the change that removes
the optimization. But looking at the change that introduced it, it was
never benchmarked in the first place. And the description actually
explicitly says that we do this just-in-case and that this is not
performance critical:

- The implementation should be reasonably performant on all
platforms since it uses the asm/word-at-a-time.h API rather than
simple byte copy. Kernel-to-kernel string copy is not considered
to be performance critical in any case.

2. Assuming we want to have both out-of-bounds reads and KASAN, deal
with it on a case-by-base basis.
For strscpy, not checking accesses at all looks like the worst option
because we will miss off-by-page accesses and use-after-free's (as
Andrey already mentioned).
Introducing read_partial_nocheck looks like the second worst, because
it introduces a one off interface routine.
Between setting max=0 and using READ_ONCE_NOCHECK+kasan_check_read I
don't have a strong preference. Setting max=0 looks simpler and less
intrusive, though.

And, yes, what we do in dcache looks worse because (1), it's
higher-level code rather than a low-level routine (and we generally
don't want to sprinkle KASAN-anything in high-level code) and (2)
unpoisoning the trailer unpoisons it for all accesses (even the ones
that unintentionally access out-of-bounds).

3. Adapt kernel memory protection model. Currently it assumes that an
arch provides protection only on page granularity. But KASAN is
effectively an arch with ultimate segmentation where each heap block
gets an own segment with precise size (equivalent to the abstract C
machine as well). So we could introduce PROTECTION_GRANULARITY
constant which is usually PAGE_SIZE, but 1 for KASAN. Then we can
write clean code based on this model. But I am not sure we want to go
this route while we have only 2 cases. And effectively we will still
have different code executed under KASAN.

2017-07-19 15:39:49

by Chris Metcalf

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On 7/18/2017 6:04 PM, Andrew Morton wrote:
> On Wed, 19 Jul 2017 00:31:36 +0300 Andrey Ryabinin <[email protected]> wrote:
>
>> On 07/18/2017 11:26 PM, Linus Torvalds wrote:
>>> On Tue, Jul 18, 2017 at 1:15 PM, Andrey Ryabinin
>>> <[email protected]> wrote:
>>>> No, it does warn about valid users. The report that Dave posted wasn't about wrong strscpy() usage
>>>> it was about reading 8-bytes from 5-bytes source string. It wasn't about buggy 'count' at all.
>>>> So KASAN will warn for perfectly valid code like this:
>>>> char dest[16];
>>>> strscpy(dest, "12345", sizeof(dest)):
>>> Ugh, ok, yes.
>>>
>>>> For strscpy() that would mean making the *whole* read from 'src' buffer unchecked by KASAN.
>>> So we do have that READ_ONCE_NOCHECK(), but could we perhaps have
>>> something that doesn't do a NOCHECK but a partial check and is simply
>>> ok with "this is an optimistc longer access"
>>>
>> This can be dont, I think.
>>
>> Something like this:
>> static inline unsigned long read_partial_nocheck(unsigned long *x)
>> {
>> unsigned long ret = READ_ONCE_NOCHECK(x);
>> kasan_check_partial(x, sizeof(unsigned long));
>> return ret;
>> }
>>
> (Cc Chris)
>
> We could just remove all that word-at-a-time logic. Do we have any
> evidence that this would harm anything?

The word-at-a-time logic was part of the initial commit since I wanted
to ensure that strscpy could be used to replace strlcpy or strncpy without
serious concerns about performance. It seems unfortunate to remove it
unconditionally to support KASAN, but I haven't looked deeply at the
tradeoffs here.

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

2017-07-19 16:05:55

by Dave Jones

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Wed, Jul 19, 2017 at 11:39:32AM -0400, Chris Metcalf wrote:

> > We could just remove all that word-at-a-time logic. Do we have any
> > evidence that this would harm anything?
>
> The word-at-a-time logic was part of the initial commit since I wanted
> to ensure that strscpy could be used to replace strlcpy or strncpy without
> serious concerns about performance.

I'm curious what the typical length of the strings we're concerned about
in this case are if this makes a difference.

Dave

2017-07-26 12:06:19

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: [PATCH] lib/strscpy: avoid KASAN false positive

On Wed, Jul 19, 2017 at 6:05 PM, Dave Jones <[email protected]> wrote:
> On Wed, Jul 19, 2017 at 11:39:32AM -0400, Chris Metcalf wrote:
>
> > > We could just remove all that word-at-a-time logic. Do we have any
> > > evidence that this would harm anything?
> >
> > The word-at-a-time logic was part of the initial commit since I wanted
> > to ensure that strscpy could be used to replace strlcpy or strncpy without
> > serious concerns about performance.
>
> I'm curious what the typical length of the strings we're concerned about
> in this case are if this makes a difference.

My vote is for proceeding with the original Andrey's patch. It's not
perfect, but it's simple, short, minimally intrusive and fixes the
problem at hand. We can do something more fundamental when/if we have
more such cases.