2024-01-05 02:35:44

by Qu Wenruo

[permalink] [raw]
Subject: [PATCH v4 0/4] kstrtox: introduce memparse_safe()

[CHANGELOG]
v4:
- Extra test cases for supported but not enabled suffixes

- Comments update for memparse_safe() function

v3:
- Fix the 32bit pointer pattern in the test case
The old pointer pattern for 32 bit systems is in fact 40 bits,
which would still lead to sparse warning.
The newer pattern is using UINTPTR_MAX to trim the pattern, then
converted to a pointer, which should not cause any trimmed bits and
make sparse happy.

v2:
- Make _parse_integer_fixup_radix() to always treat "0x" as hex
This is to make sure invalid strings like "0x" or "0xG" to fail
as expected for memparse_safe().
Or they would only parse the first 0, then leaving "x" for caller
to handle.

- Update the test case to include above failure cases
This including:
* "0x", just hex prefix without any suffix/follow up chars
* "0xK", just hex prefix and a stray suffix
* "0xY", hex prefix with an invalid char

- Fix a bug in btrfs' conversion to memparse_safe()
Where I forgot to delete the old memparse() line.

- Fix a compiler warning on m68K
On that platform, a pointer (32 bits) is smaller than unsigned long long
(64 bits), which can cause static checker to warn.


Qu Wenruo (4):
kstrtox: always skip the leading "0x" even if no more valid chars
kstrtox: introduce a safer version of memparse()
kstrtox: add unit tests for memparse_safe()
btrfs: migrate to the newer memparse_safe() helper

arch/x86/boot/string.c | 2 +-
fs/btrfs/ioctl.c | 6 +-
fs/btrfs/super.c | 9 +-
fs/btrfs/sysfs.c | 14 ++-
include/linux/kernel.h | 8 +-
include/linux/kstrtox.h | 15 +++
lib/cmdline.c | 4 +-
lib/kstrtox.c | 101 ++++++++++++++++-
lib/test-kstrtox.c | 244 ++++++++++++++++++++++++++++++++++++++++
9 files changed, 394 insertions(+), 9 deletions(-)

--
2.43.0



2024-01-05 02:36:02

by Qu Wenruo

[permalink] [raw]
Subject: [PATCH v4 1/4] kstrtox: always skip the leading "0x" even if no more valid chars

Currently the invalid string "0x" (only hex prefix, no valid chars
followed) would make _parse_integer_fixup_radix() to treat it as octal.

This is due to the fact that the function would only auto-detect hex if
and only if there is any valid hex char after "0x".
Or it would only go octal instead.

Thankfully this won't affect our unit test, as "0x" would still be
treated as failure (-EINVAL) anyway:

- Old code treats "0x" as '0' with tailing 'x'
Thus return -EINVAL due to the tailing 'x'.

- New code treats "0x" as "0x" suffix with nothing following up
Thus return -EINVAL due to no valid string.

But for the incoming memparse_safe(), the remaining string would still
be consumed by the caller, and we need to properly treat "0x" as an
invalid string.

So this patch would make _parse_integer_fixup_radix() to forcefully go
hex no matter if there is any valid char following.

And there is a also copy of _parse_integer_fixup_radix() inside
arch/x86/boot/string.c, to keep the code consistent this patch would
also modify that copy.

Thankfully for that copy in arch/x86/boot/string.c, it's only doing
kstrtoll(), thus there would be no behavior change, just as explained
above.

Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Disseldorp <[email protected]>
---
arch/x86/boot/string.c | 2 +-
lib/kstrtox.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c
index 1c8541ae3b3a..49750ef697bb 100644
--- a/arch/x86/boot/string.c
+++ b/arch/x86/boot/string.c
@@ -233,7 +233,7 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base)
{
if (*base == 0) {
if (s[0] == '0') {
- if (_tolower(s[1]) == 'x' && isxdigit(s[2]))
+ if (_tolower(s[1]) == 'x')
*base = 16;
else
*base = 8;
diff --git a/lib/kstrtox.c b/lib/kstrtox.c
index d586e6af5e5a..41c9a499bbf3 100644
--- a/lib/kstrtox.c
+++ b/lib/kstrtox.c
@@ -27,7 +27,7 @@ const char *_parse_integer_fixup_radix(const char *s, unsigned int *base)
{
if (*base == 0) {
if (s[0] == '0') {
- if (_tolower(s[1]) == 'x' && isxdigit(s[2]))
+ if (_tolower(s[1]) == 'x')
*base = 16;
else
*base = 8;
--
2.43.0


2024-01-05 02:36:36

by Qu Wenruo

[permalink] [raw]
Subject: [PATCH v4 2/4] kstrtox: introduce a safer version of memparse()

[BUGS]
Function memparse() lacks error handling:

- If no valid number string at all
In that case @retptr would just be updated and return value would be
zero.

- No overflown detection
This applies to both the number string part, and the suffixes part.
And since we have no way to indicate errors, we can get weird results
like:

"25E" -> 10376293541461622784 (9E)

This is due to the fact that for "E" suffix, there is only 4 bits
left, and 25 with 60 bits left shift would lead to overflow.

[CAUSE]
The root cause is already mentioned in the comments of the function, the
usage of simple_strtoull() is the source of evil.
Furthermore the function prototype is no good either, just returning an
unsigned long long gives us no way to indicate an error.

[FIX]
Due to the prototype limits, we can not have a drop-in replacement for
memparse().

This patch can only help by introduce a new helper, memparse_safe(), and
mark the old memparse() deprecated.

The new memparse_safe() has the following improvement:

- Invalid string detection
If no number string can be detected at all, -EINVAL would be returned.

- Better overflow detection
Both the string part and the extra left shift would have overflow
detection.
Any overflow would result -ERANGE.

- Safer default suffix selection
The helper allows the caller to choose the suffixes that they want to
use.
But only "KMGTP" are recommended by default since the "E" leaves only
4 bits before overflow.
For those callers really know what they are doing, they can still
manually to include all suffixes.

Due to the prototype change, callers should migrate to the new one and
change their code and add extra error handling.

Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Disseldorp <[email protected]>
---
include/linux/kernel.h | 8 +++-
include/linux/kstrtox.h | 15 +++++++
lib/cmdline.c | 4 +-
lib/kstrtox.c | 99 +++++++++++++++++++++++++++++++++++++++++
4 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d9ad21058eed..b1b6da60ea43 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -201,7 +201,13 @@ void do_exit(long error_code) __noreturn;

extern int get_option(char **str, int *pint);
extern char *get_options(const char *str, int nints, int *ints);
-extern unsigned long long memparse(const char *ptr, char **retptr);
+
+/*
+ * DEPRECATED, lack of any kind of error handling.
+ *
+ * Use memparse_safe() from lib/kstrtox.c instead.
+ */
+extern __deprecated unsigned long long memparse(const char *ptr, char **retptr);
extern bool parse_option_str(const char *str, const char *option);
extern char *next_arg(char *args, char **param, char **val);

diff --git a/include/linux/kstrtox.h b/include/linux/kstrtox.h
index 7fcf29a4e0de..53a1e059dd31 100644
--- a/include/linux/kstrtox.h
+++ b/include/linux/kstrtox.h
@@ -9,6 +9,21 @@
int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
int __must_check _kstrtol(const char *s, unsigned int base, long *res);

+enum memparse_suffix {
+ MEMPARSE_SUFFIX_K = 1 << 0,
+ MEMPARSE_SUFFIX_M = 1 << 1,
+ MEMPARSE_SUFFIX_G = 1 << 2,
+ MEMPARSE_SUFFIX_T = 1 << 3,
+ MEMPARSE_SUFFIX_P = 1 << 4,
+ MEMPARSE_SUFFIX_E = 1 << 5,
+};
+
+#define MEMPARSE_SUFFIXES_DEFAULT (MEMPARSE_SUFFIX_K | MEMPARSE_SUFFIX_M |\
+ MEMPARSE_SUFFIX_G | MEMPARSE_SUFFIX_T |\
+ MEMPARSE_SUFFIX_P)
+
+int __must_check memparse_safe(const char *s, enum memparse_suffix suffixes,
+ unsigned long long *res, char **retptr);
int __must_check kstrtoull(const char *s, unsigned int base, unsigned long long *res);
int __must_check kstrtoll(const char *s, unsigned int base, long long *res);

diff --git a/lib/cmdline.c b/lib/cmdline.c
index 90ed997d9570..35dbb03b5592 100644
--- a/lib/cmdline.c
+++ b/lib/cmdline.c
@@ -139,10 +139,12 @@ char *get_options(const char *str, int nints, int *ints)
EXPORT_SYMBOL(get_options);

/**
- * memparse - parse a string with mem suffixes into a number
+ * memparse - DEPRECATED, parse a string with mem suffixes into a number
* @ptr: Where parse begins
* @retptr: (output) Optional pointer to next char after parse completes
*
+ * There is no way to handle errors, and no overflown detection and string
+ * sanity checks.
* Parses a string into a number. The number stored at @ptr is
* potentially suffixed with K, M, G, T, P, E.
*/
diff --git a/lib/kstrtox.c b/lib/kstrtox.c
index 41c9a499bbf3..375c7f0842e3 100644
--- a/lib/kstrtox.c
+++ b/lib/kstrtox.c
@@ -113,6 +113,105 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
return 0;
}

+/**
+ * memparse_safe - convert a string to an unsigned long long, safer version of
+ * memparse()
+ *
+ * @s: The start of the string. Must be null-terminated.
+ * The base is determined automatically, if it starts with "0x"
+ * the base is 16, if it starts with "0" the base is 8, otherwise
+ * the base is 10.
+ * After a valid number string, there can be at most one
+ * case-insensitive suffix character, specified by the @suffixes
+ * parameter.
+ *
+ * @suffixes: The suffixes which should be handled. Use logical ORed
+ * memparse_suffix enum to indicate the supported suffixes.
+ * The suffixes are case-insensitive, all 2 ^ 10 based.
+ * Supported ones are "KMGPTE".
+ * If one suffix (one of "KMGPTE") is hit but that suffix is
+ * not specified in the @suffxies parameter, it ends the parse
+ * normally, with @retptr pointed to the (unsupported) suffix.
+ * E.g. "68k" with suffxies "M" returns 68 decimal, @retptr
+ * updated to 'k'.
+ *
+ * @res: Where to write the result.
+ *
+ * @retptr: (output) Optional pointer to the next char after parse completes.
+ *
+ * Returns:
+ * * %0 if any valid numeric string can be parsed, and @retptr is updated.
+ * * %-EINVAL if no valid number string can be found.
+ * * %-ERANGE if the number overflows.
+ * * For negative return values, @retptr is not updated.
+ */
+noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
+ unsigned long long *res, char **retptr)
+{
+ unsigned long long value;
+ unsigned int rv;
+ int shift = 0;
+ int base = 0;
+
+ s = _parse_integer_fixup_radix(s, &base);
+ rv = _parse_integer(s, base, &value);
+ if (rv & KSTRTOX_OVERFLOW)
+ return -ERANGE;
+ if (rv == 0)
+ return -EINVAL;
+
+ s += rv;
+ switch (*s) {
+ case 'K':
+ case 'k':
+ if (!(suffixes & MEMPARSE_SUFFIX_K))
+ break;
+ shift = 10;
+ break;
+ case 'M':
+ case 'm':
+ if (!(suffixes & MEMPARSE_SUFFIX_M))
+ break;
+ shift = 20;
+ break;
+ case 'G':
+ case 'g':
+ if (!(suffixes & MEMPARSE_SUFFIX_G))
+ break;
+ shift = 30;
+ break;
+ case 'T':
+ case 't':
+ if (!(suffixes & MEMPARSE_SUFFIX_T))
+ break;
+ shift = 40;
+ break;
+ case 'P':
+ case 'p':
+ if (!(suffixes & MEMPARSE_SUFFIX_P))
+ break;
+ shift = 50;
+ break;
+ case 'E':
+ case 'e':
+ if (!(suffixes & MEMPARSE_SUFFIX_E))
+ break;
+ shift = 60;
+ break;
+ }
+ if (shift) {
+ s++;
+ if (value >> (64 - shift))
+ return -ERANGE;
+ value <<= shift;
+ }
+ *res = value;
+ if (retptr)
+ *retptr = (char *)s;
+ return 0;
+}
+EXPORT_SYMBOL(memparse_safe);
+
/**
* kstrtoull - convert a string to an unsigned long long
* @s: The start of the string. The string must be null-terminated, and may also
--
2.43.0


2024-01-05 02:36:56

by Qu Wenruo

[permalink] [raw]
Subject: [PATCH v4 3/4] kstrtox: add unit tests for memparse_safe()

The new tests cases for memparse_safe() include:

- The existing test cases for kstrtoull()
Including all the 3 bases (8, 10, 16), and all the ok and failure
cases.
Although there are something we need to verify specific for
memparse_safe():

* @retptr and @value are not modified for failure cases

* return value are correct for failure cases

* @retptr is correct for the good cases

- New test cases
Not only testing the result value, but also the @retptr, including:

* good cases with extra tailing chars, but without valid prefix
The @retptr should point to the first char after a valid string.
3 cases for all the 3 bases.

* good cases with extra tailing chars, with valid prefix
5 cases for all the suffixes.

* good cases with valid but disabled suffixes
The same 5 cases as the previous one, just disable the corresponding
suffixes.

* bad cases without any number but stray suffix
Should be rejected with -EINVAL

Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Disseldorp <[email protected]>
---
lib/test-kstrtox.c | 244 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 244 insertions(+)

diff --git a/lib/test-kstrtox.c b/lib/test-kstrtox.c
index f355f67169b6..8f52b53c1374 100644
--- a/lib/test-kstrtox.c
+++ b/lib/test-kstrtox.c
@@ -268,6 +268,247 @@ static void __init test_kstrtoll_ok(void)
TEST_OK(kstrtoll, long long, "%lld", test_ll_ok);
}

+/*
+ * The special pattern to make sure the result is not modified for error cases.
+ */
+#define ULL_PATTERN (0xefefefef7a7a7a7aULL)
+#define POINTER_PATTERN ((void *)(uintptr_t)(ULL_PATTERN & UINTPTR_MAX))
+
+/* Want to include "E" suffix for full coverage. */
+#define MEMPARSE_TEST_SUFFIX (MEMPARSE_SUFFIX_K | MEMPARSE_SUFFIX_M |\
+ MEMPARSE_SUFFIX_G | MEMPARSE_SUFFIX_T |\
+ MEMPARSE_SUFFIX_P | MEMPARSE_SUFFIX_E)
+
+static void __init test_memparse_safe_fail(void)
+{
+ struct memparse_test_fail {
+ const char *str;
+ /* Expected error number, either -EINVAL or -ERANGE. */
+ unsigned int expected_ret;
+ };
+ static const struct memparse_test_fail tests[] __initconst = {
+ /* No valid string can be found at all. */
+ {"", -EINVAL},
+ {"\n", -EINVAL},
+ {"\n0", -EINVAL},
+ {"+", -EINVAL},
+ {"-", -EINVAL},
+
+ /* Only hex prefix, but no valid string. */
+ {"0x", -EINVAL},
+ {"0X", -EINVAL},
+
+ /* Only hex prefix, with suffix but still no valid string. */
+ {"0xK", -EINVAL},
+ {"0xM", -EINVAL},
+ {"0xG", -EINVAL},
+
+ /* Only hex prefix, with invalid chars. */
+ {"0xH", -EINVAL},
+ {"0xy", -EINVAL},
+
+ /*
+ * No support for any leading "+-" chars, even followed by a valid
+ * number.
+ */
+ {"-0", -EINVAL},
+ {"+0", -EINVAL},
+ {"-1", -EINVAL},
+ {"+1", -EINVAL},
+
+ /* Stray suffix would also be rejected. */
+ {"K", -EINVAL},
+ {"P", -EINVAL},
+
+ /* Overflow in the string itself*/
+ {"18446744073709551616", -ERANGE},
+ {"02000000000000000000000", -ERANGE},
+ {"0x10000000000000000", -ERANGE},
+
+ /*
+ * Good string but would overflow with suffix.
+ *
+ * Note, for "E" suffix, one should not use with hex, or "0x1E"
+ * would be treated as 0x1e (30 in decimal), not 0x1 and "E" suffix.
+ * Another reason "E" suffix is cursed.
+ */
+ {"16E", -ERANGE},
+ {"020E", -ERANGE},
+ {"16384P", -ERANGE},
+ {"040000P", -ERANGE},
+ {"16777216T", -ERANGE},
+ {"0100000000T", -ERANGE},
+ {"17179869184G", -ERANGE},
+ {"0200000000000G", -ERANGE},
+ {"17592186044416M", -ERANGE},
+ {"0400000000000000M", -ERANGE},
+ {"18014398509481984K", -ERANGE},
+ {"01000000000000000000K", -ERANGE},
+ };
+ unsigned int i;
+
+ for_each_test(i, tests) {
+ const struct memparse_test_fail *t = &tests[i];
+ unsigned long long tmp = ULL_PATTERN;
+ char *retptr = POINTER_PATTERN;
+ int ret;
+
+ ret = memparse_safe(t->str, MEMPARSE_TEST_SUFFIX, &tmp, &retptr);
+ if (ret != t->expected_ret) {
+ WARN(1, "str '%s', expected ret %d got %d\n", t->str,
+ t->expected_ret, ret);
+ continue;
+ }
+ if (tmp != ULL_PATTERN)
+ WARN(1, "str '%s' failed as expected, but result got modified",
+ t->str);
+ if (retptr != POINTER_PATTERN)
+ WARN(1, "str '%s' failed as expected, but pointer got modified",
+ t->str);
+ }
+}
+
+static void __init test_memparse_safe_ok(void)
+{
+ struct memparse_test_ok {
+ const char *str;
+ unsigned long long expected_value;
+ /* How many bytes the @retptr pointer should be moved forward. */
+ unsigned int retptr_off;
+
+ /* If 0, falls back to MEMPARSE_TEST_SUFFIX. */
+ enum memparse_suffix suffixes;
+ };
+ static DEFINE_TEST_OK(struct memparse_test_ok, tests) = {
+ /*
+ * The same pattern of kstrtoull, just with extra @retptr
+ * verification.
+ */
+ {"0", 0ULL, 1},
+ {"1", 1ULL, 1},
+ {"127", 127ULL, 3},
+ {"128", 128ULL, 3},
+ {"129", 129ULL, 3},
+ {"255", 255ULL, 3},
+ {"256", 256ULL, 3},
+ {"257", 257ULL, 3},
+ {"32767", 32767ULL, 5},
+ {"32768", 32768ULL, 5},
+ {"32769", 32769ULL, 5},
+ {"65535", 65535ULL, 5},
+ {"65536", 65536ULL, 5},
+ {"65537", 65537ULL, 5},
+ {"2147483647", 2147483647ULL, 10},
+ {"2147483648", 2147483648ULL, 10},
+ {"2147483649", 2147483649ULL, 10},
+ {"4294967295", 4294967295ULL, 10},
+ {"4294967296", 4294967296ULL, 10},
+ {"4294967297", 4294967297ULL, 10},
+ {"9223372036854775807", 9223372036854775807ULL, 19},
+ {"9223372036854775808", 9223372036854775808ULL, 19},
+ {"9223372036854775809", 9223372036854775809ULL, 19},
+ {"18446744073709551614", 18446744073709551614ULL, 20},
+ {"18446744073709551615", 18446744073709551615ULL, 20},
+
+ {"00", 00ULL, 2},
+ {"01", 01ULL, 2},
+ {"0177", 0177ULL, 4},
+ {"0200", 0200ULL, 4},
+ {"0201", 0201ULL, 4},
+ {"0377", 0377ULL, 4},
+ {"0400", 0400ULL, 4},
+ {"0401", 0401ULL, 4},
+ {"077777", 077777ULL, 6},
+ {"0100000", 0100000ULL, 7},
+ {"0100001", 0100001ULL, 7},
+ {"0177777", 0177777ULL, 7},
+ {"0200000", 0200000ULL, 7},
+ {"0200001", 0200001ULL, 7},
+ {"017777777777", 017777777777ULL, 12},
+ {"020000000000", 020000000000ULL, 12},
+ {"020000000001", 020000000001ULL, 12},
+ {"037777777777", 037777777777ULL, 12},
+ {"040000000000", 040000000000ULL, 12},
+ {"040000000001", 040000000001ULL, 12},
+ {"0777777777777777777777", 0777777777777777777777ULL, 22},
+ {"01000000000000000000000", 01000000000000000000000ULL, 23},
+ {"01000000000000000000001", 01000000000000000000001ULL, 23},
+ {"01777777777777777777776", 01777777777777777777776ULL, 23},
+ {"01777777777777777777777", 01777777777777777777777ULL, 23},
+
+ {"0x0", 0x0ULL, 3},
+ {"0x1", 0x1ULL, 3},
+ {"0x7f", 0x7fULL, 4},
+ {"0x80", 0x80ULL, 4},
+ {"0x81", 0x81ULL, 4},
+ {"0xff", 0xffULL, 4},
+ {"0x100", 0x100ULL, 5},
+ {"0x101", 0x101ULL, 5},
+ {"0x7fff", 0x7fffULL, 6},
+ {"0x8000", 0x8000ULL, 6},
+ {"0x8001", 0x8001ULL, 6},
+ {"0xffff", 0xffffULL, 6},
+ {"0x10000", 0x10000ULL, 7},
+ {"0x10001", 0x10001ULL, 7},
+ {"0x7fffffff", 0x7fffffffULL, 10},
+ {"0x80000000", 0x80000000ULL, 10},
+ {"0x80000001", 0x80000001ULL, 10},
+ {"0xffffffff", 0xffffffffULL, 10},
+ {"0x100000000", 0x100000000ULL, 11},
+ {"0x100000001", 0x100000001ULL, 11},
+ {"0x7fffffffffffffff", 0x7fffffffffffffffULL, 18},
+ {"0x8000000000000000", 0x8000000000000000ULL, 18},
+ {"0x8000000000000001", 0x8000000000000001ULL, 18},
+ {"0xfffffffffffffffe", 0xfffffffffffffffeULL, 18},
+ {"0xffffffffffffffff", 0xffffffffffffffffULL, 18},
+
+ /* Now with extra non-suffix chars to test @retptr update. */
+ {"1q84", 1, 1},
+ {"02o45", 2, 2},
+ {"0xffvii", 0xff, 4},
+
+ /*
+ * Valid suffix then tailing chars, to test the @retptr
+ * behavior.
+ */
+ {"68k ", 69632, 3},
+ {"8MS", 8388608, 2},
+ {"0xaeGis", 0x2b80000000, 5},
+ {"0xaTx", 0xa0000000000, 4},
+ {"3E8", 0x3000000000000000, 2},
+
+ /* Valid suffixes but not enabled. */
+ {"68k ", 68, 2, MEMPARSE_SUFFIX_M},
+ {"8MS", 8, 1, MEMPARSE_SUFFIX_K},
+ {"0xaeGis", 0xae, 4, MEMPARSE_SUFFIX_K},
+ {"0xaTx", 0xa, 3, MEMPARSE_SUFFIX_K},
+ {"3E8", 3, 1, MEMPARSE_SUFFIX_K},
+ };
+ unsigned int i;
+
+ for_each_test(i, tests) {
+ const struct memparse_test_ok *t = &tests[i];
+ unsigned long long tmp;
+ char *retptr;
+ int ret;
+ enum memparse_suffix suffixes = MEMPARSE_TEST_SUFFIX;
+
+ if (t->suffixes)
+ suffixes = t->suffixes;
+
+ ret = memparse_safe(t->str, suffixes, &tmp, &retptr);
+ if (ret != 0) {
+ WARN(1, "str '%s', expected ret 0 got %d\n", t->str, ret);
+ continue;
+ }
+ if (tmp != t->expected_value)
+ WARN(1, "str '%s' incorrect result, expected %llu got %llu",
+ t->str, t->expected_value, tmp);
+ if (retptr != t->str + t->retptr_off)
+ WARN(1, "str '%s' incorrect endptr, expected %u got %zu",
+ t->str, t->retptr_off, retptr - t->str);
+ }
+}
static void __init test_kstrtoll_fail(void)
{
static DEFINE_TEST_FAIL(test_ll_fail) = {
@@ -710,6 +951,9 @@ static int __init test_kstrtox_init(void)
test_kstrtoll_ok();
test_kstrtoll_fail();

+ test_memparse_safe_ok();
+ test_memparse_safe_fail();
+
test_kstrtou64_ok();
test_kstrtou64_fail();
test_kstrtos64_ok();
--
2.43.0


2024-01-05 02:37:17

by Qu Wenruo

[permalink] [raw]
Subject: [PATCH v4 4/4] btrfs: migrate to the newer memparse_safe() helper

The new helper has better error report and correct overflow detection,
furthermore the old @retptr behavior is also kept, thus there should be
no behavior change.

Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Disseldorp <[email protected]>
---
fs/btrfs/ioctl.c | 6 +++++-
fs/btrfs/super.c | 9 ++++++++-
fs/btrfs/sysfs.c | 14 +++++++++++---
3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 4e50b62db2a8..cb63f50a2078 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1175,7 +1175,11 @@ static noinline int btrfs_ioctl_resize(struct file *file,
mod = 1;
sizestr++;
}
- new_size = memparse(sizestr, &retptr);
+
+ ret = memparse_safe(sizestr, MEMPARSE_SUFFIXES_DEFAULT,
+ &new_size, &retptr);
+ if (ret < 0)
+ goto out_finish;
if (*retptr != '\0' || new_size == 0) {
ret = -EINVAL;
goto out_finish;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3a677b808f0f..0f29fd692e0f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -263,6 +263,8 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
{
struct btrfs_fs_context *ctx = fc->fs_private;
struct fs_parse_result result;
+ /* Only for memparse_safe() caller. */
+ int ret;
int opt;

opt = fs_parse(fc, btrfs_fs_parameters, param, &result);
@@ -400,7 +402,12 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
ctx->thread_pool_size = result.uint_32;
break;
case Opt_max_inline:
- ctx->max_inline = memparse(param->string, NULL);
+ ret = memparse_safe(param->string, MEMPARSE_SUFFIXES_DEFAULT,
+ &ctx->max_inline, NULL);
+ if (ret < 0) {
+ btrfs_err(NULL, "invalid string \"%s\"", param->string);
+ return ret;
+ }
break;
case Opt_acl:
if (result.negated) {
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 84c05246ffd8..6846572496a6 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -762,6 +762,7 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
struct btrfs_fs_info *fs_info = to_fs_info(get_btrfs_kobj(kobj));
char *retptr;
u64 val;
+ int ret;

if (!capable(CAP_SYS_ADMIN))
return -EPERM;
@@ -776,7 +777,10 @@ static ssize_t btrfs_chunk_size_store(struct kobject *kobj,
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
return -EPERM;

- val = memparse(buf, &retptr);
+ ret = memparse_safe(buf, MEMPARSE_SUFFIXES_DEFAULT, &val, &retptr);
+ if (ret < 0)
+ return ret;
+
/* There could be trailing '\n', also catch any typos after the value */
retptr = skip_spaces(retptr);
if (*retptr != 0 || val == 0)
@@ -1779,10 +1783,14 @@ static ssize_t btrfs_devinfo_scrub_speed_max_store(struct kobject *kobj,
{
struct btrfs_device *device = container_of(kobj, struct btrfs_device,
devid_kobj);
- char *endptr;
unsigned long long limit;
+ char *endptr;
+ int ret;
+
+ ret = memparse_safe(buf, MEMPARSE_SUFFIXES_DEFAULT, &limit, &endptr);
+ if (ret < 0)
+ return ret;

- limit = memparse(buf, &endptr);
/* There could be trailing '\n', also catch any typos after the value. */
endptr = skip_spaces(endptr);
if (*endptr != 0)
--
2.43.0


2024-01-05 19:04:40

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] btrfs: migrate to the newer memparse_safe() helper

On Fri, Jan 05, 2024 at 01:05:02PM +1030, Qu Wenruo wrote:
> The new helper has better error report and correct overflow detection,
> furthermore the old @retptr behavior is also kept, thus there should be
> no behavior change.
>
> Signed-off-by: Qu Wenruo <[email protected]>
> Reviewed-by: David Disseldorp <[email protected]>
> ---
> fs/btrfs/ioctl.c | 6 +++++-
> fs/btrfs/super.c | 9 ++++++++-
> fs/btrfs/sysfs.c | 14 +++++++++++---
> 3 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 4e50b62db2a8..cb63f50a2078 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1175,7 +1175,11 @@ static noinline int btrfs_ioctl_resize(struct file *file,
> mod = 1;
> sizestr++;
> }
> - new_size = memparse(sizestr, &retptr);
> +
> + ret = memparse_safe(sizestr, MEMPARSE_SUFFIXES_DEFAULT,
> + &new_size, &retptr);
> + if (ret < 0)
> + goto out_finish;
> if (*retptr != '\0' || new_size == 0) {
> ret = -EINVAL;
> goto out_finish;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 3a677b808f0f..0f29fd692e0f 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -263,6 +263,8 @@ static int btrfs_parse_param(struct fs_context *fc, struct fs_parameter *param)
> {
> struct btrfs_fs_context *ctx = fc->fs_private;
> struct fs_parse_result result;
> + /* Only for memparse_safe() caller. */

Please drop the comment.

2024-01-06 14:35:30

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] kstrtox: introduce memparse_safe()

On Fri, Jan 05, 2024 at 01:04:58PM +1030, Qu Wenruo wrote:
> [CHANGELOG]
> v4:
> - Extra test cases for supported but not enabled suffixes
>
> - Comments update for memparse_safe() function

Same comment is applied as per v3.

--
With Best Regards,
Andy Shevchenko



2024-01-15 04:27:45

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] kstrtox: introduce a safer version of memparse()

Hi,

On 1/4/24 18:35, Qu Wenruo wrote:
> [BUGS]
> Function memparse() lacks error handling:
>
> - If no valid number string at all
> In that case @retptr would just be updated and return value would be
> zero.
>
> - No overflown detection

overflow

> This applies to both the number string part, and the suffixes part.
> And since we have no way to indicate errors, we can get weird results
> like:
>
> "25E" -> 10376293541461622784 (9E)
>
> This is due to the fact that for "E" suffix, there is only 4 bits
> left, and 25 with 60 bits left shift would lead to overflow.
>
> [CAUSE]
> The root cause is already mentioned in the comments of the function, the
> usage of simple_strtoull() is the source of evil.
> Furthermore the function prototype is no good either, just returning an
> unsigned long long gives us no way to indicate an error.
>
> [FIX]
> Due to the prototype limits, we can not have a drop-in replacement for
> memparse().
>
> This patch can only help by introduce a new helper, memparse_safe(), and
> mark the old memparse() deprecated.
>
> The new memparse_safe() has the following improvement:
>
> - Invalid string detection
> If no number string can be detected at all, -EINVAL would be returned.

is returned.

>
> - Better overflow detection
> Both the string part and the extra left shift would have overflow

have overflow

> detection.
> Any overflow would result -ERANGE.

Any overflow results in -ERANGE.

>
> - Safer default suffix selection
> The helper allows the caller to choose the suffixes that they want to
> use.
> But only "KMGTP" are recommended by default since the "E" leaves only
> 4 bits before overflow.
> For those callers really know what they are doing, they can still
> manually to include all suffixes.
>
> Due to the prototype change, callers should migrate to the new one and
> change their code and add extra error handling.
>
> Signed-off-by: Qu Wenruo <[email protected]>
> Reviewed-by: David Disseldorp <[email protected]>
> ---
> include/linux/kernel.h | 8 +++-
> include/linux/kstrtox.h | 15 +++++++
> lib/cmdline.c | 4 +-
> lib/kstrtox.c | 99 +++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 124 insertions(+), 2 deletions(-)
>

> diff --git a/lib/cmdline.c b/lib/cmdline.c
> index 90ed997d9570..35dbb03b5592 100644
> --- a/lib/cmdline.c
> +++ b/lib/cmdline.c
> @@ -139,10 +139,12 @@ char *get_options(const char *str, int nints, int *ints)
> EXPORT_SYMBOL(get_options);
>
> /**
> - * memparse - parse a string with mem suffixes into a number
> + * memparse - DEPRECATED, parse a string with mem suffixes into a number
> * @ptr: Where parse begins
> * @retptr: (output) Optional pointer to next char after parse completes
> *
> + * There is no way to handle errors, and no overflown detection and string
> + * sanity checks.
> * Parses a string into a number. The number stored at @ptr is
> * potentially suffixed with K, M, G, T, P, E.
> */
> diff --git a/lib/kstrtox.c b/lib/kstrtox.c
> index 41c9a499bbf3..375c7f0842e3 100644
> --- a/lib/kstrtox.c
> +++ b/lib/kstrtox.c
> @@ -113,6 +113,105 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
> return 0;
> }
>
> +/**
> + * memparse_safe - convert a string to an unsigned long long, safer version of
> + * memparse()
> + *
> + * @s: The start of the string. Must be null-terminated.

Unless I misunderstand, this is the biggest problem that I see with
memparse_safe(): "Must be null-terminated".
memparse() does not have that requirement.

And how is @retptr updated if the string is null-terminated?

If the "Must be null-terminated." is correct, it requires that every user/caller
first determine the end of the number (how? space and/or any special character
or any alphabetic character that is not in KMGTPE? Then save that ending char,
change it to NUL, call memparse_safe(), then restore the saved char?

I'm hoping that the documentation is not correct...

> + * The base is determined automatically, if it starts with "0x"
> + * the base is 16, if it starts with "0" the base is 8, otherwise
> + * the base is 10.
> + * After a valid number string, there can be at most one
> + * case-insensitive suffix character, specified by the @suffixes
> + * parameter.
> + *
> + * @suffixes: The suffixes which should be handled. Use logical ORed
> + * memparse_suffix enum to indicate the supported suffixes.
> + * The suffixes are case-insensitive, all 2 ^ 10 based.
> + * Supported ones are "KMGPTE".
> + * If one suffix (one of "KMGPTE") is hit but that suffix is
> + * not specified in the @suffxies parameter, it ends the parse
> + * normally, with @retptr pointed to the (unsupported) suffix.
> + * E.g. "68k" with suffxies "M" returns 68 decimal, @retptr
> + * updated to 'k'.
> + *
> + * @res: Where to write the result.
> + *
> + * @retptr: (output) Optional pointer to the next char after parse completes.
> + *
> + * Returns:
> + * * %0 if any valid numeric string can be parsed, and @retptr is updated.
> + * * %-EINVAL if no valid number string can be found.
> + * * %-ERANGE if the number overflows.
> + * * For negative return values, @retptr is not updated.
> + */
> +noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
> + unsigned long long *res, char **retptr)
> +{

Thanks.
--
#Randy

2024-01-15 05:28:02

by Qu Wenruo

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] kstrtox: introduce a safer version of memparse()



On 2024/1/15 14:57, Randy Dunlap wrote:
[...]
>> @@ -113,6 +113,105 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
>> return 0;
>> }
>>
>> +/**
>> + * memparse_safe - convert a string to an unsigned long long, safer version of
>> + * memparse()
>> + *
>> + * @s: The start of the string. Must be null-terminated.
>
> Unless I misunderstand, this is the biggest problem that I see with
> memparse_safe(): "Must be null-terminated".
> memparse() does not have that requirement.

This is just an extra safety requirement.

In reality, memparse_safe() would end at the either the first
unsupported suffix after the valid numeric string (including '\0'),
or won't be updated if any error is hit (either no valid string at all,
or some overflow happened).

For most if not all call sites, the string passed in is already
null-terminated.

>
> And how is @retptr updated if the string is null-terminated?

E.g "123456G\0", in this case if suffix "G" is allowed, then @retptr
would be updated to '\0'.

Or another example "123456\0", @retptr would still be updated to '\0'.

>
> If the "Must be null-terminated." is correct, it requires that every user/caller
> first determine the end of the number (how? space and/or any special character
> or any alphabetic character that is not in KMGTPE? Then save that ending char,
> change it to NUL, call memparse_safe(), then restore the saved char?

There are already test cases like "86k \0" (note all strings in the test
case is all null terminated), which would lead to a success parse, with
@retptr updated to ' ' (if suffix K is specified) or 'k' (if suffix K is
not specified).

So the behavior is still the same.
It may be my expression too confusing.

Any recommendation for the comments?

Thanks,
Qu

>
> I'm hoping that the documentation is not correct...
>
>> + * The base is determined automatically, if it starts with "0x"
>> + * the base is 16, if it starts with "0" the base is 8, otherwise
>> + * the base is 10.
>> + * After a valid number string, there can be at most one
>> + * case-insensitive suffix character, specified by the @suffixes
>> + * parameter.
>> + *
>> + * @suffixes: The suffixes which should be handled. Use logical ORed
>> + * memparse_suffix enum to indicate the supported suffixes.
>> + * The suffixes are case-insensitive, all 2 ^ 10 based.
>> + * Supported ones are "KMGPTE".
>> + * If one suffix (one of "KMGPTE") is hit but that suffix is
>> + * not specified in the @suffxies parameter, it ends the parse
>> + * normally, with @retptr pointed to the (unsupported) suffix.
>> + * E.g. "68k" with suffxies "M" returns 68 decimal, @retptr
>> + * updated to 'k'.
>> + *
>> + * @res: Where to write the result.
>> + *
>> + * @retptr: (output) Optional pointer to the next char after parse completes.
>> + *
>> + * Returns:
>> + * * %0 if any valid numeric string can be parsed, and @retptr is updated.
>> + * * %-EINVAL if no valid number string can be found.
>> + * * %-ERANGE if the number overflows.
>> + * * For negative return values, @retptr is not updated.
>> + */
>> +noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
>> + unsigned long long *res, char **retptr)
>> +{
>
> Thanks.

2024-01-18 21:30:30

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] kstrtox: introduce a safer version of memparse()

Hi,

On 1/14/24 21:27, Qu Wenruo wrote:
>
>
> On 2024/1/15 14:57, Randy Dunlap wrote:
> [...]
>>> @@ -113,6 +113,105 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
>>>       return 0;
>>>   }
>>>
>>> +/**
>>> + * memparse_safe - convert a string to an unsigned long long, safer version of
>>> + * memparse()
>>> + *
>>> + * @s:        The start of the string. Must be null-terminated.
>>
>> Unless I misunderstand, this is the biggest problem that I see with
>> memparse_safe(): "Must be null-terminated".
>> memparse() does not have that requirement.
>
> This is just an extra safety requirement.
>
> In reality, memparse_safe() would end at the either the first
> unsupported suffix after the valid numeric string (including '\0'),
> or won't be updated if any error is hit (either no valid string at all,
> or some overflow happened).
>
> For most if not all call sites, the string passed in is already
> null-terminated.
>
>>
>> And how is @retptr updated if the string is null-terminated?
>
> E.g "123456G\0", in this case if suffix "G" is allowed, then @retptr
> would be updated to '\0'.
>
> Or another example "123456\0", @retptr would still be updated to '\0'.
>
>>
>> If the "Must be null-terminated." is correct, it requires that every user/caller
>> first determine the end of the number (how? space and/or any special character
>> or any alphabetic character that is not in KMGTPE? Then save that ending char,
>> change it to NUL, call memparse_safe(), then restore the saved char?
>
> There are already test cases like "86k \0" (note all strings in the test
> case is all null terminated), which would lead to a success parse, with
> @retptr updated to ' ' (if suffix K is specified) or 'k' (if suffix K is
> not specified).
>
> So the behavior is still the same.
> It may be my expression too confusing.
>
> Any recommendation for the comments?

Well, "Must be null-terminated." is incorrect, so explain better where
the numeric conversion ends.

Thanks.

>
> Thanks,
> Qu
>
>>
>> I'm hoping that the documentation is not correct...
>>
>>> + *        The base is determined automatically, if it starts with "0x"
>>> + *        the base is 16, if it starts with "0" the base is 8, otherwise
>>> + *        the base is 10.
>>> + *        After a valid number string, there can be at most one
>>> + *        case-insensitive suffix character, specified by the @suffixes
>>> + *        parameter.
>>> + *
>>> + * @suffixes:    The suffixes which should be handled. Use logical ORed
>>> + *        memparse_suffix enum to indicate the supported suffixes.
>>> + *        The suffixes are case-insensitive, all 2 ^ 10 based.
>>> + *        Supported ones are "KMGPTE".
>>> + *        If one suffix (one of "KMGPTE") is hit but that suffix is
>>> + *        not specified in the @suffxies parameter, it ends the parse
>>> + *        normally, with @retptr pointed to the (unsupported) suffix.
>>> + *        E.g. "68k" with suffxies "M" returns 68 decimal, @retptr
>>> + *        updated to 'k'.
>>> + *
>>> + * @res:    Where to write the result.
>>> + *
>>> + * @retptr:    (output) Optional pointer to the next char after parse completes.
>>> + *
>>> + * Returns:
>>> + * * %0 if any valid numeric string can be parsed, and @retptr is updated.
>>> + * * %-EINVAL if no valid number string can be found.
>>> + * * %-ERANGE if the number overflows.
>>> + * * For negative return values, @retptr is not updated.
>>> + */
>>> +noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
>>> +               unsigned long long *res, char **retptr)
>>> +{
>>
>> Thanks.
>

--
#Randy