2023-06-10 20:43:44

by Demi Marie Obenour

[permalink] [raw]
Subject: [PATCH v3 0/4] Make sscanf() stricter

Roger Pau Monné suggested making xenbus_scanf() stricter instead of
using a custom parser. Christoph Hellwig asked why the normal vsscanf()
cannot be made stricter. Richard Weinberger mentioned Linus Torvalds’s
suggestion of using ! to allow overflow.

Changes since v2:

- Better commit messages.
- Fix a compile error in simple_strtoll() (found by 0day bot).
- Fix an uninitialized variable (found by Dan Carpenter).

Changes since v1:

- Better commit messages.
- Use ! to explicitly opt-in to allowing overflow.
- Treat overflow as a conversion failure instead of returning ERANGE.
- Drop the first patch (removal of simple_strtoll()) as it breaks
bcache.
- Stop skipping spaces in vsscanf() instead of adding a separate
vsscanf_strict() function.

Demi Marie Obenour (4):
limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN
vsscanf(): Integer overflow is a conversion failure
vsscanf(): do not skip spaces
Reject NUL bytes in xenstore nodes

.../hive_isp_css_include/platform_support.h | 1 -
drivers/xen/xenbus/xenbus_xs.c | 17 +++-
include/linux/limits.h | 1 +
include/linux/mfd/wl1273-core.h | 3 -
include/vdso/limits.h | 3 +
lib/vsprintf.c | 98 +++++++++++++------
6 files changed, 86 insertions(+), 37 deletions(-)

--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab



2023-06-10 20:53:08

by Demi Marie Obenour

[permalink] [raw]
Subject: [PATCH v3 3/4] vsscanf(): do not skip spaces

Passing spaces before e.g. an integer is usually
not intended. This was suggested by Christoph in
https://lore.kernel.org/lkml/[email protected]/.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Demi Marie Obenour <[email protected]>
---
lib/vsprintf.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 9e53355c35b1d6260631868228ede1d178fe3325..665f6197f8313d653f67d7886b12c43942e058dd 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -3551,8 +3551,6 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
char *s = (char *)va_arg(args, char *);
if (field_width == -1)
field_width = SHRT_MAX;
- /* first, skip leading white space in buffer */
- str = skip_spaces(str);

/* now copy until next white space */
while (*str && !isspace(*str) && field_width--)
@@ -3639,11 +3637,7 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
return num;
}

- /* have some sort of integer conversion.
- * first, skip white space in buffer.
- */
- str = skip_spaces(str);
-
+ /* have some sort of integer conversion. */
digit = *str;
if (is_sign && digit == '-') {
if (field_width == 1)
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


2023-06-10 20:53:24

by Demi Marie Obenour

[permalink] [raw]
Subject: [PATCH v3 1/4] limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN

Some drivers already defined these, and they will be used by sscanf()
for overflow checks later. Also add SSIZE_MIN to limits.h, which will
also be needed later.

Signed-off-by: Demi Marie Obenour <[email protected]>
---
.../media/atomisp/pci/hive_isp_css_include/platform_support.h | 1 -
include/linux/limits.h | 1 +
include/linux/mfd/wl1273-core.h | 3 ---
include/vdso/limits.h | 3 +++
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h b/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
index 0cdef4a5e8b1bed9884133f1a0b9d853d59d43a4..e29b96d8bebf14839f6dd48fdc6c0f8b029ef31d 100644
--- a/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
+++ b/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
@@ -27,7 +27,6 @@

#define UINT16_MAX USHRT_MAX
#define UINT32_MAX UINT_MAX
-#define UCHAR_MAX (255)

#define CSS_ALIGN(d, a) d __attribute__((aligned(a)))

diff --git a/include/linux/limits.h b/include/linux/limits.h
index f6bcc936901071f496e3e85bb6e1d93905b12e32..8f7fd85b41fb46e6992d9e5912da00424119227a 100644
--- a/include/linux/limits.h
+++ b/include/linux/limits.h
@@ -8,6 +8,7 @@

#define SIZE_MAX (~(size_t)0)
#define SSIZE_MAX ((ssize_t)(SIZE_MAX >> 1))
+#define SSIZE_MIN (-SSIZE_MAX - 1)
#define PHYS_ADDR_MAX (~(phys_addr_t)0)

#define U8_MAX ((u8)~0U)
diff --git a/include/linux/mfd/wl1273-core.h b/include/linux/mfd/wl1273-core.h
index c28cf76d5c31ee1c94a9319a2e2d318bf00283a6..b81a229135ed9f756c749122a8341816031c8311 100644
--- a/include/linux/mfd/wl1273-core.h
+++ b/include/linux/mfd/wl1273-core.h
@@ -204,9 +204,6 @@
WL1273_IS2_TRI_OPT | \
WL1273_IS2_RATE_48K)

-#define SCHAR_MIN (-128)
-#define SCHAR_MAX 127
-
#define WL1273_FR_EVENT BIT(0)
#define WL1273_BL_EVENT BIT(1)
#define WL1273_RDS_EVENT BIT(2)
diff --git a/include/vdso/limits.h b/include/vdso/limits.h
index 0197888ad0e00b2f853d3f25ffa764f61cca7385..0cad0a2490e5efc194d874025eb3e3b846a5c7b4 100644
--- a/include/vdso/limits.h
+++ b/include/vdso/limits.h
@@ -2,6 +2,9 @@
#ifndef __VDSO_LIMITS_H
#define __VDSO_LIMITS_H

+#define UCHAR_MAX ((unsigned char)~0U)
+#define SCHAR_MAX ((signed char)(UCHAR_MAX >> 1))
+#define SCHAR_MIN ((signed char)(-SCHAR_MAX - 1))
#define USHRT_MAX ((unsigned short)~0U)
#define SHRT_MAX ((short)(USHRT_MAX >> 1))
#define SHRT_MIN ((short)(-SHRT_MAX - 1))
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


2023-06-10 20:54:17

by Demi Marie Obenour

[permalink] [raw]
Subject: [PATCH v3 2/4] vsscanf(): Integer overflow is a conversion failure

sscanf() and friends currently ignore integer overflow, but this is a
bad idea. It is much better to detect integer overflow errors and
consider this a conversion failure.

This implements Linus's suggestion of using '!' to treat integer
overflow as wrapping. It does _not_ allow wrapping of unsigned
conversions by default, though, as in at least some cases accepting
negative numbers is _not_ intended.

Suggested-By: Linus Torvalds <[email protected]>
Signed-off-by: Demi Marie Obenour <[email protected]>
---
lib/vsprintf.c | 90 ++++++++++++++++++++++++++++++++++++++------------
1 file changed, 69 insertions(+), 21 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 40f560959b169b4c4ac6154d658cfe76cfd0c5a6..9e53355c35b1d6260631868228ede1d178fe3325 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -59,7 +59,7 @@
bool no_hash_pointers __ro_after_init;
EXPORT_SYMBOL_GPL(no_hash_pointers);

-static noinline unsigned long long simple_strntoull(const char *startp, size_t max_chars, char **endp, unsigned int base)
+static noinline unsigned long long simple_strntoull(const char *startp, size_t max_chars, char **endp, unsigned int base, bool *overflow)
{
const char *cp;
unsigned long long result = 0ULL;
@@ -70,11 +70,13 @@ static noinline unsigned long long simple_strntoull(const char *startp, size_t m
prefix_chars = cp - startp;
if (prefix_chars < max_chars) {
rv = _parse_integer_limit(cp, base, &result, max_chars - prefix_chars);
- /* FIXME */
+ *overflow = !!(rv & KSTRTOX_OVERFLOW);
cp += (rv & ~KSTRTOX_OVERFLOW);
} else {
/* Field too short for prefix + digit, skip over without converting */
cp = startp + max_chars;
+ /* Treat this as a conversion failure */
+ *overflow = true;
}

if (endp)
@@ -94,7 +96,9 @@ static noinline unsigned long long simple_strntoull(const char *startp, size_t m
noinline
unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base)
{
- return simple_strntoull(cp, INT_MAX, endp, base);
+ bool _placeholder;
+
+ return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
}
EXPORT_SYMBOL(simple_strtoull);

@@ -130,18 +134,22 @@ long simple_strtol(const char *cp, char **endp, unsigned int base)
EXPORT_SYMBOL(simple_strtol);

static long long simple_strntoll(const char *cp, size_t max_chars, char **endp,
- unsigned int base)
+ unsigned int base, bool *overflow)
{
+ unsigned long long minand;
+ bool negate;
+
/*
* simple_strntoull() safely handles receiving max_chars==0 in the
* case cp[0] == '-' && max_chars == 1.
* If max_chars == 0 we can drop through and pass it to simple_strntoull()
* and the content of *cp is irrelevant.
*/
- if (*cp == '-' && max_chars > 0)
- return -simple_strntoull(cp + 1, max_chars - 1, endp, base);
-
- return simple_strntoull(cp, max_chars, endp, base);
+ negate = *cp == '-' && max_chars > 0;
+ minand = simple_strntoull(cp + negate, max_chars - negate, endp, base, overflow);
+ if (minand > (unsigned long long)LONG_MAX + negate)
+ *overflow = true;
+ return negate ? -minand : minand;
}

/**
@@ -154,7 +162,9 @@ static long long simple_strntoll(const char *cp, size_t max_chars, char **endp,
*/
long long simple_strtoll(const char *cp, char **endp, unsigned int base)
{
- return simple_strntoll(cp, INT_MAX, endp, base);
+ bool _placeholder;
+
+ return simple_strntoll(cp, INT_MAX, endp, base, &_placeholder);
}
EXPORT_SYMBOL(simple_strtoll);

@@ -3441,7 +3451,7 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
unsigned long long u;
} val;
s16 field_width;
- bool is_sign;
+ bool is_sign, overflow, allow_overflow;

while (*fmt) {
/* skip any white space in format */
@@ -3464,6 +3474,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
break;
++fmt;

+ allow_overflow = *fmt == '!';
+ fmt += (int)allow_overflow;
+
/* skip this conversion.
* advance both strings to next white space
*/
@@ -3649,45 +3662,80 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
if (is_sign)
val.s = simple_strntoll(str,
field_width >= 0 ? field_width : INT_MAX,
- &next, base);
+ &next, base, &overflow);
else
val.u = simple_strntoull(str,
field_width >= 0 ? field_width : INT_MAX,
- &next, base);
+ &next, base, &overflow);
+ if (unlikely(overflow) && !allow_overflow)
+ break;

switch (qualifier) {
case 'H': /* that's 'hh' in format */
- if (is_sign)
+ if (is_sign) {
+ if (unlikely(val.s < SCHAR_MIN || val.s > SCHAR_MAX) &&
+ !allow_overflow)
+ return num;
*va_arg(args, signed char *) = val.s;
- else
+ } else {
+ if (unlikely(val.u > UCHAR_MAX) && !allow_overflow)
+ return num;
*va_arg(args, unsigned char *) = val.u;
+ }
break;
case 'h':
- if (is_sign)
+ if (is_sign) {
+ if (unlikely(val.s < SHRT_MIN || val.s > SHRT_MAX) &&
+ !allow_overflow)
+ return num;
*va_arg(args, short *) = val.s;
- else
+ } else {
+ if (unlikely(val.u > USHRT_MAX) && !allow_overflow)
+ return num;
*va_arg(args, unsigned short *) = val.u;
+ }
break;
case 'l':
- if (is_sign)
+ if (is_sign) {
+ if (unlikely(val.s < LONG_MIN || val.s > LONG_MAX) &&
+ !allow_overflow)
+ return num;
*va_arg(args, long *) = val.s;
- else
+ } else {
+ if (unlikely(val.u > ULONG_MAX) && !allow_overflow)
+ return num;
*va_arg(args, unsigned long *) = val.u;
+ }
break;
case 'L':
+ /* No overflow check needed */
if (is_sign)
*va_arg(args, long long *) = val.s;
else
*va_arg(args, unsigned long long *) = val.u;
break;
case 'z':
- *va_arg(args, size_t *) = val.u;
+ if (is_sign) {
+ if (unlikely(val.s < SSIZE_MIN || val.s > SSIZE_MAX))
+ return num;
+ *va_arg(args, ssize_t *) = val.s;
+ } else {
+ if (unlikely(val.u > SIZE_MAX) && !allow_overflow)
+ return num;
+ *va_arg(args, size_t *) = val.u;
+ }
break;
default:
- if (is_sign)
+ if (is_sign) {
+ if (unlikely(val.s < INT_MIN || val.s > INT_MAX) &&
+ !allow_overflow)
+ return num;
*va_arg(args, int *) = val.s;
- else
+ } else {
+ if (unlikely(val.u > UINT_MAX) && !allow_overflow)
+ return num;
*va_arg(args, unsigned int *) = val.u;
+ }
break;
}
num++;
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


2023-06-10 21:08:43

by Demi Marie Obenour

[permalink] [raw]
Subject: [PATCH v3 4/4] Reject NUL bytes in xenstore nodes

This rejects bogus xenstore node values that include interior NUL
bytes. These would be truncated by functions that expect NUL-terminated
strings.

Signed-off-by: Demi Marie Obenour <[email protected]>
---
drivers/xen/xenbus/xenbus_xs.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index 12e02eb01f5991b31db451cc57037205359b347f..7cb2a22a7698ac40c81add23476594d9f27de8d0 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -569,16 +569,20 @@ int xenbus_scanf(struct xenbus_transaction t,
const char *dir, const char *node, const char *fmt, ...)
{
va_list ap;
- int ret;
+ int ret = 0;
+ unsigned int len;
char *val;

- val = xenbus_read(t, dir, node, NULL);
+ val = xenbus_read(t, dir, node, &len);
if (IS_ERR(val))
return PTR_ERR(val);
+ if (strlen(val) != len)
+ goto bad;

va_start(ap, fmt);
ret = vsscanf(val, fmt, ap);
va_end(ap);
+bad:
kfree(val);
/* Distinctive errno. */
if (ret == 0)
@@ -636,15 +640,18 @@ int xenbus_gather(struct xenbus_transaction t, const char *dir, ...)
while (ret == 0 && (name = va_arg(ap, char *)) != NULL) {
const char *fmt = va_arg(ap, char *);
void *result = va_arg(ap, void *);
+ unsigned len;
char *p;

- p = xenbus_read(t, dir, name, NULL);
+ p = xenbus_read(t, dir, name, &len);
if (IS_ERR(p)) {
ret = PTR_ERR(p);
break;
}
- if (fmt) {
- if (sscanf(p, fmt, result) == 0)
+ if (strlen(p) != len)
+ ret = -EINVAL;
+ else if (fmt) {
+ if (sscanf(p, fmt, result) <= 0)
ret = -EINVAL;
kfree(p);
} else
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


2023-06-12 11:37:52

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] vsscanf(): do not skip spaces

On 10/06/2023 22.40, Demi Marie Obenour wrote:
> Passing spaces before e.g. an integer is usually
> not intended.

Maybe, maybe not. But it's mandated by POSIX/C99.

And of course we are free to ignore that and implement our own semantics
(though within the constraints that we really want -Wformat to help us),
but there seems to be existing code in-tree that relies on this
behavior. For example I think this will break
fsl_sata_intr_coalescing_store() which uses a scanf format of "%u%u".

Sure, that could just say "%u %u" instead, but the point is that
currently it doesn't. So without some reasonably thorough analysis
across the tree, and updates of affected callers, NAK.

Rasmus


2023-06-12 11:39:02

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v3 2/4] vsscanf(): Integer overflow is a conversion failure

On 10/06/2023 22.40, Demi Marie Obenour wrote:
> sscanf() and friends currently ignore integer overflow, but this is a
> bad idea. It is much better to detect integer overflow errors and
> consider this a conversion failure.

Perhaps. And maybe I even agree. But not like this:

> while (*fmt) {
> /* skip any white space in format */
> @@ -3464,6 +3474,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
> break;
> ++fmt;
>
> + allow_overflow = *fmt == '!';
> + fmt += (int)allow_overflow;
> +

You can't do that. Or, at least, you won't be able to actually use %!d
anywhere, because the compiler will yell at you:

lib/vsprintf.c: In function ‘foobar’:
lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in
format [-Werror=format=]
3727 | ret = sscanf("12345", "%!d", &val);
| ^

So NAK.

Also, when you make significant changes to the sscanf implementation,
I'd expect the diffstat for the patch series to contain lib/test_scanf.c.

Rasmus

2023-06-12 11:41:08

by Lee Jones

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN

On Sat, 10 Jun 2023, Demi Marie Obenour wrote:

> Some drivers already defined these, and they will be used by sscanf()
> for overflow checks later. Also add SSIZE_MIN to limits.h, which will
> also be needed later.
>
> Signed-off-by: Demi Marie Obenour <[email protected]>
> ---
> .../media/atomisp/pci/hive_isp_css_include/platform_support.h | 1 -
> include/linux/limits.h | 1 +
> include/linux/mfd/wl1273-core.h | 3 ---

Acked-by: Lee Jones <[email protected]>

> include/vdso/limits.h | 3 +++
> 4 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h b/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
> index 0cdef4a5e8b1bed9884133f1a0b9d853d59d43a4..e29b96d8bebf14839f6dd48fdc6c0f8b029ef31d 100644
> --- a/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
> +++ b/drivers/staging/media/atomisp/pci/hive_isp_css_include/platform_support.h
> @@ -27,7 +27,6 @@
>
> #define UINT16_MAX USHRT_MAX
> #define UINT32_MAX UINT_MAX
> -#define UCHAR_MAX (255)
>
> #define CSS_ALIGN(d, a) d __attribute__((aligned(a)))
>
> diff --git a/include/linux/limits.h b/include/linux/limits.h
> index f6bcc936901071f496e3e85bb6e1d93905b12e32..8f7fd85b41fb46e6992d9e5912da00424119227a 100644
> --- a/include/linux/limits.h
> +++ b/include/linux/limits.h
> @@ -8,6 +8,7 @@
>
> #define SIZE_MAX (~(size_t)0)
> #define SSIZE_MAX ((ssize_t)(SIZE_MAX >> 1))
> +#define SSIZE_MIN (-SSIZE_MAX - 1)
> #define PHYS_ADDR_MAX (~(phys_addr_t)0)
>
> #define U8_MAX ((u8)~0U)
> diff --git a/include/linux/mfd/wl1273-core.h b/include/linux/mfd/wl1273-core.h
> index c28cf76d5c31ee1c94a9319a2e2d318bf00283a6..b81a229135ed9f756c749122a8341816031c8311 100644
> --- a/include/linux/mfd/wl1273-core.h
> +++ b/include/linux/mfd/wl1273-core.h
> @@ -204,9 +204,6 @@
> WL1273_IS2_TRI_OPT | \
> WL1273_IS2_RATE_48K)
>
> -#define SCHAR_MIN (-128)
> -#define SCHAR_MAX 127
> -
> #define WL1273_FR_EVENT BIT(0)
> #define WL1273_BL_EVENT BIT(1)
> #define WL1273_RDS_EVENT BIT(2)
> diff --git a/include/vdso/limits.h b/include/vdso/limits.h
> index 0197888ad0e00b2f853d3f25ffa764f61cca7385..0cad0a2490e5efc194d874025eb3e3b846a5c7b4 100644
> --- a/include/vdso/limits.h
> +++ b/include/vdso/limits.h
> @@ -2,6 +2,9 @@
> #ifndef __VDSO_LIMITS_H
> #define __VDSO_LIMITS_H
>
> +#define UCHAR_MAX ((unsigned char)~0U)
> +#define SCHAR_MAX ((signed char)(UCHAR_MAX >> 1))
> +#define SCHAR_MIN ((signed char)(-SCHAR_MAX - 1))
> #define USHRT_MAX ((unsigned short)~0U)
> #define SHRT_MAX ((short)(USHRT_MAX >> 1))
> #define SHRT_MIN ((short)(-SHRT_MAX - 1))
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
> Invisible Things Lab
>

--
Lee Jones [李琼斯]

2023-06-12 11:53:07

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 3/4] vsscanf(): do not skip spaces

From: Demi Marie Obenour
> Sent: 10 June 2023 21:41
>
> Passing spaces before e.g. an integer is usually
> not intended. This was suggested by Christoph in
> https://lore.kernel.org/lkml/[email protected]/.

This is contrary to libc scanf and could easily affect userspace
writing fixed width values into sysctl nodes (etc).

IIRC strtoul() (etc) are also expected to strip leading spaces.
Removing the sign in sscanf() may lead to "- 12" being
valid - which may not be desired.

David

>
> Suggested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Demi Marie Obenour <[email protected]>
> ---
> lib/vsprintf.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 9e53355c35b1d6260631868228ede1d178fe3325..665f6197f8313d653f67d7886b12c43942e058dd 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -3551,8 +3551,6 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
> char *s = (char *)va_arg(args, char *);
> if (field_width == -1)
> field_width = SHRT_MAX;
> - /* first, skip leading white space in buffer */
> - str = skip_spaces(str);
>
> /* now copy until next white space */
> while (*str && !isspace(*str) && field_width--)
> @@ -3639,11 +3637,7 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
> return num;
> }
>
> - /* have some sort of integer conversion.
> - * first, skip white space in buffer.
> - */
> - str = skip_spaces(str);
> -
> + /* have some sort of integer conversion. */
> digit = *str;
> if (is_sign && digit == '-') {
> if (field_width == 1)
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
> Invisible Things Lab

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2023-06-12 12:04:47

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

> + bool _placeholder;
> + return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);

This can be done without introducing dummy variables:

void f(bool *b)
{
}

f((bool[1]){});

> > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> So NAK.

Yeah, ! should go after format specifier like it does for %p.

2023-06-12 15:54:54

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Sat, Jun 10, 2023 at 04:40:40PM -0400, Demi Marie Obenour wrote:
> Roger Pau Monné suggested making xenbus_scanf() stricter instead of
> using a custom parser. Christoph Hellwig asked why the normal vsscanf()
> cannot be made stricter. Richard Weinberger mentioned Linus Torvalds’s
> suggestion of using ! to allow overflow.

As Rasmus articulated, NAK w.o. test cases being added to all parts where your
changes touch.

--
With Best Regards,
Andy Shevchenko



2023-06-12 16:57:37

by Vincenzo Frascino

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN

Hi Demi,

On 6/10/23 21:40, Demi Marie Obenour wrote:
> Some drivers already defined these, and they will be used by sscanf()
> for overflow checks later. Also add SSIZE_MIN to limits.h, which will
> also be needed later.
>
> Signed-off-by: Demi Marie Obenour <[email protected]>
> ---
> .../media/atomisp/pci/hive_isp_css_include/platform_support.h | 1 -
> include/linux/limits.h | 1 +
> include/linux/mfd/wl1273-core.h | 3 ---
> include/vdso/limits.h | 3 +++
> 4 files changed, 4 insertions(+), 4 deletions(-)
>
...

> diff --git a/include/vdso/limits.h b/include/vdso/limits.h
> index 0197888ad0e00b2f853d3f25ffa764f61cca7385..0cad0a2490e5efc194d874025eb3e3b846a5c7b4 100644
> --- a/include/vdso/limits.h
> +++ b/include/vdso/limits.h
> @@ -2,6 +2,9 @@
> #ifndef __VDSO_LIMITS_H
> #define __VDSO_LIMITS_H
>
> +#define UCHAR_MAX ((unsigned char)~0U)
> +#define SCHAR_MAX ((signed char)(UCHAR_MAX >> 1))
> +#define SCHAR_MIN ((signed char)(-SCHAR_MAX - 1))

Are you planning to use those definitions in the vDSO library?

If not can you please define them in linux/limits.h, the vdso headers contain
only what is necessary for the vDSO library.

Thanks!

> #define USHRT_MAX ((unsigned short)~0U)
> #define SHRT_MAX ((short)(USHRT_MAX >> 1))
> #define SHRT_MIN ((short)(-SHRT_MAX - 1))

--
Regards,
Vincenzo

2023-06-12 20:35:58

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 1/4] limits.h: add UCHAR_MAX, SCHAR_MAX, and SCHAR_MIN

On Mon, Jun 12, 2023 at 05:31:51PM +0100, Vincenzo Frascino wrote:
> Hi Demi,
>
> On 6/10/23 21:40, Demi Marie Obenour wrote:
> > Some drivers already defined these, and they will be used by sscanf()
> > for overflow checks later. Also add SSIZE_MIN to limits.h, which will
> > also be needed later.
> >
> > Signed-off-by: Demi Marie Obenour <[email protected]>
> > ---
> > .../media/atomisp/pci/hive_isp_css_include/platform_support.h | 1 -
> > include/linux/limits.h | 1 +
> > include/linux/mfd/wl1273-core.h | 3 ---
> > include/vdso/limits.h | 3 +++
> > 4 files changed, 4 insertions(+), 4 deletions(-)
> >
> ...
>
> > diff --git a/include/vdso/limits.h b/include/vdso/limits.h
> > index 0197888ad0e00b2f853d3f25ffa764f61cca7385..0cad0a2490e5efc194d874025eb3e3b846a5c7b4 100644
> > --- a/include/vdso/limits.h
> > +++ b/include/vdso/limits.h
> > @@ -2,6 +2,9 @@
> > #ifndef __VDSO_LIMITS_H
> > #define __VDSO_LIMITS_H
> >
> > +#define UCHAR_MAX ((unsigned char)~0U)
> > +#define SCHAR_MAX ((signed char)(UCHAR_MAX >> 1))
> > +#define SCHAR_MIN ((signed char)(-SCHAR_MAX - 1))
>
> Are you planning to use those definitions in the vDSO library?

Nope. They were added here for consistency with the other *_{MIN,MAX}
defines.

> If not can you please define them in linux/limits.h, the vdso headers contain
> only what is necessary for the vDSO library.

Will fix in the next version.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (1.60 kB)
signature.asc (849.00 B)
Download all attachments

2023-06-12 20:37:04

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > + bool _placeholder;
> > + return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
>
> This can be done without introducing dummy variables:
>
> void f(bool *b)
> {
> }
>
> f((bool[1]){});

This is more consise, but (at least to me) significantly less readable.

> > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > So NAK.
>
> Yeah, ! should go after format specifier like it does for %p.

I hadn't considered that. Is the typical approach in Linux to use e.g.
%d%[!] if one wants a literal '!'?
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (737.00 B)
signature.asc (849.00 B)
Download all attachments

2023-06-12 21:09:59

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > + bool _placeholder;
> > > + return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> >
> > This can be done without introducing dummy variables:
> >
> > void f(bool *b)
> > {
> > }
> >
> > f((bool[1]){});
>
> This is more consise, but (at least to me) significantly less readable.
>
> > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > So NAK.
> >
> > Yeah, ! should go after format specifier like it does for %p.
>
> I hadn't considered that. Is the typical approach in Linux to use e.g.
> %d%[!] if one wants a literal '!'?

It might be that the cleanest way we have is to create %p-like extensions to
sscanf(). %p takes alnum as parameter and that is usually works since it makes
a little sense to attach alnum suffix to the pointer.

(I don't like to have %dX, where X is alnum as we expanding our hack to
something which people don't expect to be altered even in the kernelm, you may
refer to the discussion about %de for printing errors)


--
With Best Regards,
Andy Shevchenko



2023-06-12 21:55:30

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 13, 2023 at 12:00:44AM +0300, Andy Shevchenko wrote:
> On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> > On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > > + bool _placeholder;
> > > > + return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> > >
> > > This can be done without introducing dummy variables:
> > >
> > > void f(bool *b)
> > > {
> > > }
> > >
> > > f((bool[1]){});
> >
> > This is more consise, but (at least to me) significantly less readable.
> >
> > > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > > So NAK.
> > >
> > > Yeah, ! should go after format specifier like it does for %p.
> >
> > I hadn't considered that. Is the typical approach in Linux to use e.g.
> > %d%[!] if one wants a literal '!'?
>
> It might be that the cleanest way we have is to create %p-like extensions to
> sscanf(). %p takes alnum as parameter and that is usually works since it makes
> a little sense to attach alnum suffix to the pointer.
>
> (I don't like to have %dX, where X is alnum as we expanding our hack to
> something which people don't expect to be altered even in the kernelm, you may
> refer to the discussion about %de for printing errors)

Personally I’m not too worried about compatibility with userspace
sscanf(), except to the extent that -Werror=format can keep working.
Userspace sscanf() is almost useless: it has undefined behavior on
integer overflow and swallows spaces that should usually be rejected.
I typically either use strto*l() or (as I am currently doing for Xen’s
toolstack) just write my own parsing functions from scratch.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (1.78 kB)
signature.asc (849.00 B)
Download all attachments

2023-06-12 22:41:41

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Mon, Jun 12, 2023 at 05:23:18PM -0400, Demi Marie Obenour wrote:
> On Tue, Jun 13, 2023 at 12:00:44AM +0300, Andy Shevchenko wrote:
> > On Mon, Jun 12, 2023 at 04:25:01PM -0400, Demi Marie Obenour wrote:
> > > On Mon, Jun 12, 2023 at 02:59:38PM +0300, Alexey Dobriyan wrote:
> > > > > + bool _placeholder;
> > > > > + return simple_strntoull(cp, INT_MAX, endp, base, &_placeholder);
> > > >
> > > > This can be done without introducing dummy variables:
> > > >
> > > > void f(bool *b)
> > > > {
> > > > }
> > > >
> > > > f((bool[1]){});
> > >
> > > This is more consise, but (at least to me) significantly less readable.
> > >
> > > > > > lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
> > > > > So NAK.
> > > >
> > > > Yeah, ! should go after format specifier like it does for %p.
> > >
> > > I hadn't considered that. Is the typical approach in Linux to use e.g.
> > > %d%[!] if one wants a literal '!'?
> >
> > It might be that the cleanest way we have is to create %p-like extensions to
> > sscanf(). %p takes alnum as parameter and that is usually works since it makes
> > a little sense to attach alnum suffix to the pointer.
> >
> > (I don't like to have %dX, where X is alnum as we expanding our hack to
> > something which people don't expect to be altered even in the kernelm, you may
> > refer to the discussion about %de for printing errors)
>
> Personally I’m not too worried about compatibility with userspace
> sscanf(), except to the extent that -Werror=format can keep working.
> Userspace sscanf() is almost useless: it has undefined behavior on
> integer overflow and swallows spaces that should usually be rejected.
> I typically either use strto*l() or (as I am currently doing for Xen’s
> toolstack) just write my own parsing functions from scratch.

`man sscanf` tells about %p, and currently we have no use (if I'm not mistaken)
for %pj in printf(), so that can be used for %pj in sscanf() to avoid ambiguity
with possible extensions to actually parse our %p extension-like strings.

Not sure if others support the idea.

--
With Best Regards,
Andy Shevchenko



2023-06-13 13:14:42

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 0/4] Make sscanf() stricter

From: Demi Marie Obenour
> Sent: 12 June 2023 22:23
....
> sscanf(), except to the extent that -Werror=format can keep working.
> Userspace sscanf() is almost useless: it has undefined behavior on
> integer overflow and swallows spaces that should usually be rejected.

scanf() is designed for parsing space separated data.
Eating spaces it part of its job description.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-06-13 13:19:56

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 3/4] vsscanf(): do not skip spaces

From: Rasmus Villemoes
> Sent: 12 June 2023 12:08
>
> On 10/06/2023 22.40, Demi Marie Obenour wrote:
> > Passing spaces before e.g. an integer is usually
> > not intended.
>
> Maybe, maybe not. But it's mandated by POSIX/C99.
>
> And of course we are free to ignore that and implement our own semantics
> (though within the constraints that we really want -Wformat to help us),
> but there seems to be existing code in-tree that relies on this
> behavior. For example I think this will break
> fsl_sata_intr_coalescing_store() which uses a scanf format of "%u%u".
>
> Sure, that could just say "%u %u" instead, but the point is that
> currently it doesn't. So without some reasonably thorough analysis
> across the tree, and updates of affected callers, NAK.

It would almost certainly need to be " %u %u" to allow for
userspace generating the input with "%6u %6u",

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-06-13 15:41:38

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 12 June 2023 22:23
> ....
> > sscanf(), except to the extent that -Werror=format can keep working.
> > Userspace sscanf() is almost useless: it has undefined behavior on
> > integer overflow and swallows spaces that should usually be rejected.
>
> scanf() is designed for parsing space separated data.
> Eating spaces it part of its job description.
>
> David

In this case I would prefer to have two versions: one that eats spaces
and one that does not. For instance, I don’t think any user of
xenbus_scanf() wants the space-swallowing behavior. This can be worked
around in xenbus_scanf(), of course, by having it reject strings with
spaces (as determened by isspace()) before calling vsscanf().
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (896.00 B)
signature.asc (849.00 B)
Download all attachments

2023-06-14 08:40:39

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 0/4] Make sscanf() stricter

From: Demi Marie Obenour
> Sent: 13 June 2023 16:35
>
> On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> > From: Demi Marie Obenour
> > > Sent: 12 June 2023 22:23
> > ....
> > > sscanf(), except to the extent that -Werror=format can keep working.
> > > Userspace sscanf() is almost useless: it has undefined behavior on
> > > integer overflow and swallows spaces that should usually be rejected.
> >
> > scanf() is designed for parsing space separated data.
> > Eating spaces it part of its job description.
> >
> > David
>
> In this case I would prefer to have two versions: one that eats spaces
> and one that does not. For instance, I don’t think any user of
> xenbus_scanf() wants the space-swallowing behavior. This can be worked
> around in xenbus_scanf(), of course, by having it reject strings with
> spaces (as determened by isspace()) before calling vsscanf().

What sort of formats and data are being used?
The "%s" format terminates on whitespace.
Even stroul() (and friends) will skip leading whitespace.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-06-14 20:40:49

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Wed, Jun 14, 2023 at 08:23:56AM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 13 June 2023 16:35
> >
> > On Tue, Jun 13, 2023 at 01:02:59PM +0000, David Laight wrote:
> > > From: Demi Marie Obenour
> > > > Sent: 12 June 2023 22:23
> > > ....
> > > > sscanf(), except to the extent that -Werror=format can keep working.
> > > > Userspace sscanf() is almost useless: it has undefined behavior on
> > > > integer overflow and swallows spaces that should usually be rejected.
> > >
> > > scanf() is designed for parsing space separated data.
> > > Eating spaces it part of its job description.
> > >
> > > David
> >
> > In this case I would prefer to have two versions: one that eats spaces
> > and one that does not. For instance, I don’t think any user of
> > xenbus_scanf() wants the space-swallowing behavior. This can be worked
> > around in xenbus_scanf(), of course, by having it reject strings with
> > spaces (as determened by isspace()) before calling vsscanf().
>
> What sort of formats and data are being used?

Base-10 or base-16 integers, with whitespace never being valid.

> The "%s" format terminates on whitespace.
> Even stroul() (and friends) will skip leading whitespace.

Yes, which is a reason that strto*l() are just broken IMO. I’m trying
to replace their uses in Xen with custom parsing code.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (1.42 kB)
signature.asc (849.00 B)
Download all attachments

2023-06-15 08:20:52

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 0/4] Make sscanf() stricter

From: Demi Marie Obenour
> Sent: 14 June 2023 21:09
....
> > What sort of formats and data are being used?
>
> Base-10 or base-16 integers, with whitespace never being valid.

In which case sscanf() really isn't what you are looking for.

> > The "%s" format terminates on whitespace.
> > Even stroul() (and friends) will skip leading whitespace.
>
> Yes, which is a reason that strto*l() are just broken IMO.

They are not 'broken', that is what is useful most of the time.
The usual problem is that "020" is treated as octal.

> I’m trying to replace their uses in Xen with custom parsing code.

Then write a custom parser :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-06-15 12:00:55

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> From: Demi Marie Obenour
> > Sent: 14 June 2023 21:09

...

> > > What sort of formats and data are being used?
> >
> > Base-10 or base-16 integers, with whitespace never being valid.
>
> In which case sscanf() really isn't what you are looking for.
>
> > > The "%s" format terminates on whitespace.
> > > Even stroul() (and friends) will skip leading whitespace.
> >
> > Yes, which is a reason that strto*l() are just broken IMO.
>
> They are not 'broken', that is what is useful most of the time.
> The usual problem is that "020" is treated as octal.
>
> > I’m trying to replace their uses in Xen with custom parsing code.
>
> Then write a custom parser :-)

Hmm... Usually we are against zillion implementations of the same with zillion
bugs hidden (each buggy implementation with its own bugs).

--
With Best Regards,
Andy Shevchenko



2023-06-15 12:17:08

by David Laight

[permalink] [raw]
Subject: RE: [PATCH v3 0/4] Make sscanf() stricter

From: Andy Shevchenko <[email protected]>
> Sent: 15 June 2023 12:24
> ...
>
> > > > What sort of formats and data are being used?
> > >
> > > Base-10 or base-16 integers, with whitespace never being valid.
> >
> > In which case sscanf() really isn't what you are looking for.
> >
> > > > The "%s" format terminates on whitespace.
> > > > Even stroul() (and friends) will skip leading whitespace.
> > >
> > > Yes, which is a reason that strto*l() are just broken IMO.
> >
> > They are not 'broken', that is what is useful most of the time.
> > The usual problem is that "020" is treated as octal.
> >
> > > I’m trying to replace their uses in Xen with custom parsing code.
> >
> > Then write a custom parser :-)
>
> Hmm... Usually we are against zillion implementations of the same with zillion
> bugs hidden (each buggy implementation with its own bugs).

Using strtoull() for the actual numeric conversion removes most of that.

I am intrigued about why you want to error leading whitespace.
I bet you'll find something that is passing space-padded input.
It might be against the letter of the spec - but it doesn't mean
it doesn't happen.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-06-20 13:57:56

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> > From: Demi Marie Obenour
> > > Sent: 14 June 2023 21:09
>
> ...
>
> > > > What sort of formats and data are being used?
> > >
> > > Base-10 or base-16 integers, with whitespace never being valid.
> >
> > In which case sscanf() really isn't what you are looking for.
> >
> > > > The "%s" format terminates on whitespace.
> > > > Even stroul() (and friends) will skip leading whitespace.
> > >
> > > Yes, which is a reason that strto*l() are just broken IMO.
> >
> > They are not 'broken', that is what is useful most of the time.
> > The usual problem is that "020" is treated as octal.

I do not know how many users depend on this behavior. But I believe
that there are such users. And breaking compatibility with userspace
implementation would make more harm then good in this case.

> > > I’m trying to replace their uses in Xen with custom parsing code.
> >
> > Then write a custom parser :-)

Honestly, I dislike any sscanf() modification which have been suggested
so far:

+ %!d is not acceptable because it produces compiler errors

+ %d! is not acceptable because "use 64!" is a realistic string.
We could not be sure that "<number>!" will never be parsed
in kernel.

+ %d%[!] produces compiler error either. It is hard to parse by eyes.
Also the meaning of such a format would be far from obvious.

+ %pj or another %p modifiers would be hard to understand either.

Yes, we have %pe but I think that only few people really use it.
And it is kind of self-explanatory because it is typically
used together with ERR_PTR() and with variables called
"err" or "ret".


> Hmm... Usually we are against zillion implementations of the same with zillion
> bugs hidden (each buggy implementation with its own bugs).

I would really like to see the code depending on it. The cover letter
suggests that there already is a patch with such a custom parser.
I am sorry if it has already been mentioned. There were so many threads.

Sure, we do not want two full featured sscanf() implementations. But a
wrapper checking for leading whitespace and using kstrto<foo>
family does not sound too complex.

There should always be a good reason to introduce an incompatibility
between the kernel and the userspace implementation of a commonly
used API.

Best Regards,
Petr

2023-06-20 14:19:44

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

> + %pj or another %p modifiers would be hard to understand either.
>
> Yes, we have %pe but I think that only few people really use it.
> And it is kind of self-explanatory because it is typically
> used together with ERR_PTR() and with variables called
> "err" or "ret".

j, besides the luck of no (yet) use in the kernel's printf(), is
described for printf(3)

j A following integer conversion corresponds to an intmax_t or uintmax_t
argument, or a following n conversion corresponds to a pointer to an
intmax_t argument.

So, I this among all proposals, this one is the best (while all of them may
sound not good).

--
With Best Regards,
Andy Shevchenko



2023-06-20 14:22:49

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 20, 2023 at 04:52:43PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

> > + %pj or another %p modifiers would be hard to understand either.
> >
> > Yes, we have %pe but I think that only few people really use it.
> > And it is kind of self-explanatory because it is typically
> > used together with ERR_PTR() and with variables called
> > "err" or "ret".
>
> j, besides the luck of no (yet) use in the kernel's printf(), is
> described for printf(3)
>
> j A following integer conversion corresponds to an intmax_t or uintmax_t
> argument, or a following n conversion corresponds to a pointer to an
> intmax_t argument.
>
> So, I this among all proposals, this one is the best (while all of them may

s/this/think

> sound not good).

--
With Best Regards,
Andy Shevchenko



2023-06-20 15:17:58

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue 2023-06-20 16:52:42, Andy Shevchenko wrote:
> On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
>
> ...
>
> > + %pj or another %p modifiers would be hard to understand either.
> >
> > Yes, we have %pe but I think that only few people really use it.
> > And it is kind of self-explanatory because it is typically
> > used together with ERR_PTR() and with variables called
> > "err" or "ret".
>
> j, besides the luck of no (yet) use in the kernel's printf(), is
> described for printf(3)
>
> j A following integer conversion corresponds to an intmax_t or uintmax_t
> argument, or a following n conversion corresponds to a pointer to an
> intmax_t argument.

I see, I have missed this coincidence. And we would really need to use %pj.
%jd requires intmax_t variable. Otherwise, the compiler produces:

kernel/lib/test.c:10:17: error: format ‘%jd’ expects argument of type ‘intmax_t *’, but argument 3 has type ‘int *’ [-Werror=format=]
sscanf(str, "%jd hello.", &tmp);

Hmm, %pj might even make some sense for sscanf() which requires pointers anyway.
But still, we would lose the compiler check of the size of the passed
buffer.

This is actually my concern with many other %p modifiers. The compiler
is not able to check that we pass the right pointer. I know that this
might happen even with wrong buffer passed to %s or so. But number
types is another category.


> So, I think among all proposals, this one is the best (while all of them may
> sound not good).

I still prefer the custom handler when it is not too complex.

Or if there are many users, we could create sscanf_strict() or so.

Best Regards,
Petr

2023-06-20 15:35:20

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 20, 2023 at 04:57:55PM +0200, Petr Mladek wrote:
> On Tue 2023-06-20 16:52:42, Andy Shevchenko wrote:
> > On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> > > On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > > > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:

...

> > > + %pj or another %p modifiers would be hard to understand either.
> > >
> > > Yes, we have %pe but I think that only few people really use it.
> > > And it is kind of self-explanatory because it is typically
> > > used together with ERR_PTR() and with variables called
> > > "err" or "ret".
> >
> > j, besides the luck of no (yet) use in the kernel's printf(), is
> > described for printf(3)
> >
> > j A following integer conversion corresponds to an intmax_t or uintmax_t
> > argument, or a following n conversion corresponds to a pointer to an
> > intmax_t argument.
>
> I see, I have missed this coincidence. And we would really need to use %pj.
> %jd requires intmax_t variable. Otherwise, the compiler produces:
>
> kernel/lib/test.c:10:17: error: format ‘%jd’ expects argument of type ‘intmax_t *’, but argument 3 has type ‘int *’ [-Werror=format=]
> sscanf(str, "%jd hello.", &tmp);
>
> Hmm, %pj might even make some sense for sscanf() which requires pointers anyway.
> But still, we would lose the compiler check of the size of the passed
> buffer.
>
> This is actually my concern with many other %p modifiers. The compiler
> is not able to check that we pass the right pointer. I know that this
> might happen even with wrong buffer passed to %s or so. But number
> types is another category.

Yeah, it was a discussion IIRC for the compiler plugin to support %p
extensions, but I have no idea where it's now.

> > So, I think among all proposals, this one is the best (while all of them may
> > sound not good).
>
> I still prefer the custom handler when it is not too complex.
>
> Or if there are many users, we could create sscanf_strict() or so.

--
With Best Regards,
Andy Shevchenko



2023-06-21 01:42:05

by Demi Marie Obenour

[permalink] [raw]
Subject: Re: [PATCH v3 0/4] Make sscanf() stricter

On Tue, Jun 20, 2023 at 03:34:09PM +0200, Petr Mladek wrote:
> On Thu 2023-06-15 14:23:59, Andy Shevchenko wrote:
> > On Thu, Jun 15, 2023 at 08:06:46AM +0000, David Laight wrote:
> > > From: Demi Marie Obenour
> > > > Sent: 14 June 2023 21:09
> >
> > ...
> >
> > > > > What sort of formats and data are being used?
> > > >
> > > > Base-10 or base-16 integers, with whitespace never being valid.
> > >
> > > In which case sscanf() really isn't what you are looking for.
> > >
> > > > > The "%s" format terminates on whitespace.
> > > > > Even stroul() (and friends) will skip leading whitespace.
> > > >
> > > > Yes, which is a reason that strto*l() are just broken IMO.
> > >
> > > They are not 'broken', that is what is useful most of the time.
> > > The usual problem is that "020" is treated as octal.
>
> I do not know how many users depend on this behavior. But I believe
> that there are such users. And breaking compatibility with userspace
> implementation would make more harm then good in this case.
>
> > > > I’m trying to replace their uses in Xen with custom parsing code.
> > >
> > > Then write a custom parser :-)
>
> Honestly, I dislike any sscanf() modification which have been suggested
> so far:
>
> + %!d is not acceptable because it produces compiler errors
>
> + %d! is not acceptable because "use 64!" is a realistic string.
> We could not be sure that "<number>!" will never be parsed
> in kernel.
>
> + %d%[!] produces compiler error either. It is hard to parse by eyes.
> Also the meaning of such a format would be far from obvious.
>
> + %pj or another %p modifiers would be hard to understand either.
>
> Yes, we have %pe but I think that only few people really use it.
> And it is kind of self-explanatory because it is typically
> used together with ERR_PTR() and with variables called
> "err" or "ret".
>
>
> > Hmm... Usually we are against zillion implementations of the same with zillion
> > bugs hidden (each buggy implementation with its own bugs).
>
> I would really like to see the code depending on it. The cover letter
> suggests that there already is a patch with such a custom parser.
> I am sorry if it has already been mentioned. There were so many threads.
>
> Sure, we do not want two full featured sscanf() implementations. But a
> wrapper checking for leading whitespace and using kstrto<foo>
> family does not sound too complex.
>
> There should always be a good reason to introduce an incompatibility
> between the kernel and the userspace implementation of a commonly
> used API.
>
> Best Regards,
> Petr

I strongly believe that overflow should be forbidden by default, but it
turns out that I do not have time to advance this patch further. My
understanding is that Xen never wants to allow spaces in Xenstore
entries, but that is easy to ensure via an explicit check prior to
calling vsscanf().
--
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab


Attachments:
(No filename) (2.98 kB)
signature.asc (849.00 B)
Download all attachments