Hi,
We've fixed a bug with sscanf(). When passing in a 16-digit hex-string like so:
sscanf("fafafafa0b0b0b0b", "%8x%8x", &hi, &lo)
The resulting value in hi is always 0. This is because vsscanf() uses the
obsolete and broken functions simple_strtoul() and simple_strtoull(), which we
have replaced.
Many thanks,
Thomas
Thomas Preston (2):
vsprintf: Specify type for union val members
vsprintf: Stop using obsolete simple_strtoul()
lib/vsprintf.c | 66 ++++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 43 insertions(+), 23 deletions(-)
--
2.11.0
The vsscanf function uses implicit type casting to populate the long
long integers in union val from the obsolete functions simple_strtol()
and simple_strtoll(). However the new functions kstrtol() and kstrtoll()
expect a pointer to the correct type, so we must be explicit about which
type we are using.
Signed-off-by: Thomas Preston <[email protected]>
---
lib/vsprintf.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 37a54a6dd594..bbf2ac734711 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2914,8 +2914,8 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
u8 qualifier;
unsigned int base;
union {
- long long s;
- unsigned long long u;
+ long long sll;
+ unsigned long long ull;
} val;
s16 field_width;
bool is_sign;
@@ -3120,11 +3120,11 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
break;
if (is_sign)
- val.s = qualifier != 'L' ?
+ val.sll = qualifier != 'L' ?
simple_strtol(str, &next, base) :
simple_strtoll(str, &next, base);
else
- val.u = qualifier != 'L' ?
+ val.ull = qualifier != 'L' ?
simple_strtoul(str, &next, base) :
simple_strtoull(str, &next, base);
@@ -3133,9 +3133,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
_parse_integer_fixup_radix(str, &base);
while (next - str > field_width) {
if (is_sign)
- val.s = div_s64(val.s, base);
+ val.sll = div_s64(val.sll, base);
else
- val.u = div_u64(val.u, base);
+ val.ull = div_u64(val.ull, base);
--next;
}
}
@@ -3143,36 +3143,36 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
switch (qualifier) {
case 'H': /* that's 'hh' in format */
if (is_sign)
- *va_arg(args, signed char *) = val.s;
+ *va_arg(args, signed char *) = val.sll;
else
- *va_arg(args, unsigned char *) = val.u;
+ *va_arg(args, unsigned char *) = val.ull;
break;
case 'h':
if (is_sign)
- *va_arg(args, short *) = val.s;
+ *va_arg(args, short *) = val.sll;
else
- *va_arg(args, unsigned short *) = val.u;
+ *va_arg(args, unsigned short *) = val.ull;
break;
case 'l':
if (is_sign)
- *va_arg(args, long *) = val.s;
+ *va_arg(args, long *) = val.sll;
else
- *va_arg(args, unsigned long *) = val.u;
+ *va_arg(args, unsigned long *) = val.ull;
break;
case 'L':
if (is_sign)
- *va_arg(args, long long *) = val.s;
+ *va_arg(args, long long *) = val.sll;
else
- *va_arg(args, unsigned long long *) = val.u;
+ *va_arg(args, unsigned long long *) = val.ull;
break;
case 'z':
- *va_arg(args, size_t *) = val.u;
+ *va_arg(args, size_t *) = val.ull;
break;
default:
if (is_sign)
- *va_arg(args, int *) = val.s;
+ *va_arg(args, int *) = val.sll;
else
- *va_arg(args, unsigned int *) = val.u;
+ *va_arg(args, unsigned int *) = val.ull;
break;
}
num++;
--
2.11.0
Stop using the obsolete functions simple_strtoul() and
simple_strtoull(). Instead, we should use the improved kstrtol() and
kstrtoll() functions. To do this, we must copy the current field into a
null-terminated tmpstr and advance the variable `next` manually.
The width of tmpstr has been chosen because no integer field can be
larger than 22 characters (ULLONG_MAX in octal), plus sign, radix,
new-line, null-terminator and some extra for alignment.
This patch fixes a bug with vsscanf. If passing sscan a 16-digit
hex-string like so:
sscanf("fafafafa0b0b0b0b", "%8x%8x", &hi, &lo)
then the result comes out with hi always being 0.
The issue is that the code calls simple_strtoul() which consumes up
to 16-digits but only returns an unsigned long (8 hex digits on ARM).
The vsscanf() code then checks and finds that the field_width of 8 was
greater than the 16 consumed characters and tries to fix this by:
while (next - str > field_width) {
if (is_sign)
val.s = div_s64(val.s, base);
else
val.u = div_u64(val.u, base);
--next;
}
However val.{s,u} is already trunacted from the simple_strtoul call
and all that happens is the value gets divided down to zero.
Signed-off-by: Thomas Preston <[email protected]>
Signed-off-by: Ben Dooks <[email protected]>
---
lib/vsprintf.c | 36 ++++++++++++++++++++++++++++--------
1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index bbf2ac734711..ec23e18e8cc6 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -47,6 +47,8 @@
#include <linux/string_helpers.h>
#include "kstrtox.h"
+#define INT_BUF_LEN 28
+
/**
* simple_strtoull - convert a string to an unsigned long long
* @cp: The start of the string
@@ -2914,7 +2916,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
u8 qualifier;
unsigned int base;
union {
+ long sl;
long long sll;
+ unsigned long ul;
unsigned long long ull;
} val;
s16 field_width;
@@ -3119,14 +3123,30 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
|| (base == 0 && !isdigit(digit)))
break;
- if (is_sign)
- val.sll = qualifier != 'L' ?
- simple_strtol(str, &next, base) :
- simple_strtoll(str, &next, base);
- else
- val.ull = qualifier != 'L' ?
- simple_strtoul(str, &next, base) :
- simple_strtoull(str, &next, base);
+ if (unlikely((field_width+1) > INT_BUF_LEN))
+ return num;
+
+ if (field_width > 0) {
+ char tmpstr[INT_BUF_LEN];
+ int ret;
+
+ strscpy(tmpstr, str, field_width+1);
+
+ if (is_sign)
+ if (qualifier != 'L')
+ ret = kstrtol(tmpstr, base, &val.sl);
+ else
+ ret = kstrtoll(tmpstr, base, &val.sll);
+ else
+ if (qualifier != 'L')
+ ret = kstrtoul(tmpstr, base, &val.ul);
+ else
+ ret = kstrtoull(tmpstr, base, &val.ull);
+ if (ret < 0)
+ return num;
+
+ }
+ next = (char *)str + field_width;
if (field_width > 0 && next - str > field_width) {
if (base == 0)
--
2.11.0
On Tue, Dec 11, 2018 at 7:21 AM Thomas Preston
<[email protected]> wrote:
>
> Stop using the obsolete functions simple_strtoul() and
> simple_strtoull(). Instead, we should use the improved kstrtol() and
> kstrtoll() functions. To do this, we must copy the current field into a
> null-terminated tmpstr and advance the variable `next` manually.
I see what you're trying to do, but this fix is much much worse than
the bug was.
> + if (field_width > 0) {
> + char tmpstr[INT_BUF_LEN];
> + int ret;
> +
> + strscpy(tmpstr, str, field_width+1);
If field_width is larger than INT_BUF_LEN, you are now corrupting kernel stack.
And no, you can't fix it by limiting field_width, since a large
field_width is quite possible and might even be valid - and still fit
in an int. Maybe the number is
000000000000000000000001
or something?
A fix might be to skip leading zeroes.
Honestly, just do it by hand. Don't use kstrol and friends at all.
Just do something like
unsigned long long val = 0;
p = str;
for (;;) {
int c;
if (field_width > 0 && p - str >= field_width)
break;
c = hexval(*p++);
if (c < 0 || c > base)
break;
val = val * base + c;
// check for overflow
}
/* Now do "sign" and range checking on val */
/* Ta-daa, all done */
or similar. Treat the above as pseudo-code, I didn't fill in all the details.
Linus
On Tue, Dec 11, 2018 at 09:22:22AM -0800, Linus Torvalds wrote:
> On Tue, Dec 11, 2018 at 7:21 AM Thomas Preston
> <[email protected]> wrote:
> >
> > Stop using the obsolete functions simple_strtoul() and
> > simple_strtoull(). Instead, we should use the improved kstrtol() and
> > kstrtoll() functions. To do this, we must copy the current field into a
> > null-terminated tmpstr and advance the variable `next` manually.
>
> I see what you're trying to do, but this fix is much much worse than
> the bug was.
>
> > + if (field_width > 0) {
> > + char tmpstr[INT_BUF_LEN];
> > + int ret;
> > +
> > + strscpy(tmpstr, str, field_width+1);
>
> If field_width is larger than INT_BUF_LEN, you are now corrupting kernel stack.
>
> And no, you can't fix it by limiting field_width, since a large
> field_width is quite possible and might even be valid - and still fit
> in an int. Maybe the number is
>
> 000000000000000000000001
>
> or something?
>
> A fix might be to skip leading zeroes.
>
> Honestly, just do it by hand. Don't use kstrol and friends at all.
> Just do something like
>
> unsigned long long val = 0;
> p = str;
> for (;;) {
> int c;
> if (field_width > 0 && p - str >= field_width)
> break;
> c = hexval(*p++);
> if (c < 0 || c > base)
> break;
> val = val * base + c;
> // check for overflow
I think it's slightly more complicated, I run the following test case on glibc:
uint32_t hi, lo, t;
sscanf("00fafafafa0d0b0b0b0c000000", "%8x%8x%x", &hi, &lo, &t);
64-bit:
HI: 00fafafa LO: fa0d0b0b (c000000)
32-bit:
HI: 00fafafa LO: fa0d0b0b (ffffffff)
> }
> /* Now do "sign" and range checking on val */
> /* Ta-daa, all done */
>
> or similar. Treat the above as pseudo-code, I didn't fill in all the details.
>
> Linus
--
With Best Regards,
Andy Shevchenko
On Tue, Dec 11, 2018 at 10:05 AM Andy Shevchenko
<[email protected]> wrote:
>
> I think it's slightly more complicated, I run the following test case on glibc:
>
> uint32_t hi, lo, t;
>
> sscanf("00fafafafa0d0b0b0b0c000000", "%8x%8x%x", &hi, &lo, &t);
>
> 64-bit:
> HI: 00fafafa LO: fa0d0b0b (c000000)
> 32-bit:
> HI: 00fafafa LO: fa0d0b0b (ffffffff)
But that's exactly the values my pseudo-code gets (well, my
"pseudo-code obviously just said
// Now do "sign" and range checking on val
The three sub-parts are: "00fafafa" "fa0d0b0b" and "0b0c000000"
and the third one encounters an overflow in "long" on 32-bit, so it
turns into ~0.
And yes, the 64-bit "long" in that third value gets truncated to
"uint32" when written to "t" (which is wht that "0b" part just gets
lost.
And that's just because of historical C scanf behavior. There's no
overflow checking in "int". Only in "long" and "long long".
Linus
On Tue, Dec 11, 2018 at 10:19:08AM -0800, Linus Torvalds wrote:
> On Tue, Dec 11, 2018 at 10:05 AM Andy Shevchenko
> <[email protected]> wrote:
> >
> > I think it's slightly more complicated, I run the following test case on glibc:
> >
> > uint32_t hi, lo, t;
> >
> > sscanf("00fafafafa0d0b0b0b0c000000", "%8x%8x%x", &hi, &lo, &t);
> >
> > 64-bit:
> > HI: 00fafafa LO: fa0d0b0b (c000000)
> > 32-bit:
> > HI: 00fafafa LO: fa0d0b0b (ffffffff)
>
> But that's exactly the values my pseudo-code gets (well, my
> "pseudo-code obviously just said
>
> // Now do "sign" and range checking on val
>
> The three sub-parts are: "00fafafa" "fa0d0b0b" and "0b0c000000"
>
> and the third one encounters an overflow in "long" on 32-bit, so it
> turns into ~0.
>
> And yes, the 64-bit "long" in that third value gets truncated to
> "uint32" when written to "t" (which is wht that "0b" part just gets
> lost.
>
> And that's just because of historical C scanf behavior. There's no
> overflow checking in "int". Only in "long" and "long long".
Yes, that's what my test case showed.
So, whatever implementation would be done, I think it is a good time
to introduce some test cases.
P.S. I have briefly checked and didn't find a single test case for scanf().
Maybe I looked to the wrong module(s), but test_printf.c might be more or less a good choice.
--
With Best Regards,
Andy Shevchenko