2022-12-22 15:01:02

by Rasmus Villemoes

Subject: [PATCH] string.c: test *cmp for all possible 1-character strings

The switch to -funsigned-char made a pre-existing bug on m68k more
apparent. That is now fixed (by removing m68k's private strcmp(), see
commit 7c0846125358), but we still have quite a few architectures that
provide one or more of strcmp(), strncmp() and memcmp().

They probably all work fine for the cases where the input is all
ASCII, and/or where the caller only wants to know about equality or
not (i.e. only checks whether the return value is 0 or not).

Let's check that all these implementations also behave correctly for
bytes with the high bit set, and provide the correct ordering -
independent of us now building with -funsigned-char, the C standard
says that these *cmp functions should consider the buffers as
consisting of unsigned chars.

This is only intended to help find other latent bugs and can/should be
ripped out again before v6.2, or perhaps moved to test_string.c in
some form, but for now I think it's worth doing unconditionally.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
lib/string.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index 4fb566ea610f..1718f96e8082 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -880,3 +880,30 @@ void *memchr_inv(const void *start, int c, size_t bytes)
 	return check_bytes8(start, value, bytes % 8);
 }
 EXPORT_SYMBOL(memchr_inv);
+
+static int sign(int x)
+{
+	return (x > 0) - (x < 0);
+}
+
+static int test_xxxcmp(void)
+{
+	char a[2], b[2];
+	int i, j;
+
+	a[1] = b[1] = 0;
+	for (i = 0; i < 256; ++i) {
+		a[0] = i;
+		for (j = 0; j < 256; ++j) {
+			b[0] = j;
+			WARN_ONCE(sign(strcmp(a, b)) != sign(i - j),
+				  "strcmp() broken for (%2ph, %2ph)\n", a, b);
+			WARN_ONCE(sign(memcmp(a, b, 2)) != sign(i - j),
+				  "memcmp() broken for (%2ph, %2ph)\n", a, b);
+			WARN_ONCE(sign(strncmp(a, b, 2)) != sign(i - j),
+				  "strncmp() broken for (%2ph, %2ph)\n", a, b);
+		}
+	}
+	return 0;
+}
+late_initcall(test_xxxcmp);
--
2.37.2


2022-12-22 15:42:06

by Jason A. Donenfeld

Subject: Re: [PATCH] string.c: test *cmp for all possible 1-character strings

On Thu, Dec 22, 2022 at 03:05:06PM +0100, Rasmus Villemoes wrote:
> The switch to -funsigned-char made a pre-existing bug on m68k more
> apparent. That is now fixed (by removing m68k's private strcmp(), see
> commit 7c0846125358), but we still have quite a few architectures that
> provide one or more of strcmp(), strncmp() and memcmp().
>
> They probably all work fine for the cases where the input is all
> ASCII, and/or where the caller only wants to know about equality or
> not (i.e. only checks whether the return value is 0 or not).
>
> Let's check that all these implementations also behave correctly for
> bytes with the high bit set, and provide the correct ordering -
> independent of us now building with -funsigned-char, the C standard
> says that these *cmp functions should consider the buffers as
> consisting of unsigned chars.
>
> This is only intended to help find other latent bugs and can/should be
> ripped out again before v6.2, or perhaps moved to test_string.c in
> some form, but for now I think it's worth doing unconditionally.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
> lib/string.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index 4fb566ea610f..1718f96e8082 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -880,3 +880,30 @@ void *memchr_inv(const void *start, int c, size_t bytes)
>  	return check_bytes8(start, value, bytes % 8);
>  }
>  EXPORT_SYMBOL(memchr_inv);
> +
> +static int sign(int x)
> +{
> +	return (x > 0) - (x < 0);
> +}
> +
> +static int test_xxxcmp(void)
> +{
> +	char a[2], b[2];
> +	int i, j;
> +
> +	a[1] = b[1] = 0;
> +	for (i = 0; i < 256; ++i) {
> +		a[0] = i;
> +		for (j = 0; j < 256; ++j) {
> +			b[0] = j;
> +			WARN_ONCE(sign(strcmp(a, b)) != sign(i - j),
> +				  "strcmp() broken for (%2ph, %2ph)\n", a, b);
> +			WARN_ONCE(sign(memcmp(a, b, 2)) != sign(i - j),
> +				  "memcmp() broken for (%2ph, %2ph)\n", a, b);
> +			WARN_ONCE(sign(strncmp(a, b, 2)) != sign(i - j),
> +				  "strncmp() broken for (%2ph, %2ph)\n", a, b);
> +		}
> +	}
> +	return 0;
> +}
> +late_initcall(test_xxxcmp);

This probably belongs in some config-gated selftest file that can be
compiled out, rather than running unconditionally on every boot, right?

Jason

2022-12-23 08:02:46

by Rasmus Villemoes

Subject: Re: [PATCH] string.c: test *cmp for all possible 1-character strings

On 22/12/2022 16.15, Jason A. Donenfeld wrote:
> On Thu, Dec 22, 2022 at 03:05:06PM +0100, Rasmus Villemoes wrote:

>> This is only intended to help find other latent bugs and can/should be
>> ripped out again before v6.2, or perhaps moved to test_string.c in
>> some form, but for now I think it's worth doing unconditionally.
>>
> This probably belongs in some config-gated selftest file that can be
> compiled out, rather than running unconditionally on every boot, right?

I believe this was already answered in the last paragraph of the commit log.

Rasmus

2022-12-23 08:08:57

by kernel test robot

Subject: Re: [PATCH] string.c: test *cmp for all possible 1-character strings

Hi Rasmus,

I love your patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on linus/master v6.1 next-20221220]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Rasmus-Villemoes/string-c-test-cmp-for-all-possible-1-character-strings/20221222-220708
patch link: https://lore.kernel.org/r/20221222140506.1961281-1-linux%40rasmusvillemoes.dk
patch subject: [PATCH] string.c: test *cmp for all possible 1-character strings
config: riscv-randconfig-r042-20221219
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 98b13979fb05f3ed288a900deb843e7b27589e58)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install riscv cross compiling tool for clang build
# apt-get install binutils-riscv64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/0235c6544a848ef03332c7840c87b356c08a4b1d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Rasmus-Villemoes/string-c-test-cmp-for-all-possible-1-character-strings/20221222-220708
git checkout 0235c6544a848ef03332c7840c87b356c08a4b1d
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: __warn_printk
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(test_xxxcmp)
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(test_xxxcmp)
>>> referenced by ctype.c
>>> arch/riscv/purgatory/purgatory.ro:(test_xxxcmp)

--
0-DAY CI Kernel Test Service
https://01.org/lkp



2022-12-23 22:37:57

by kernel test robot

Subject: Re: [PATCH] string.c: test *cmp for all possible 1-character strings

Hi Rasmus,

I love your patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on linus/master v6.1 next-20221220]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Rasmus-Villemoes/string-c-test-cmp-for-all-possible-1-character-strings/20221222-220708
patch link: https://lore.kernel.org/r/20221222140506.1961281-1-linux%40rasmusvillemoes.dk
patch subject: [PATCH] string.c: test *cmp for all possible 1-character strings
config: riscv-allyesconfig
compiler: riscv64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/0235c6544a848ef03332c7840c87b356c08a4b1d
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Rasmus-Villemoes/string-c-test-cmp-for-all-possible-1-character-strings/20221222-220708
git checkout 0235c6544a848ef03332c7840c87b356c08a4b1d
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

riscv64-linux-ld: arch/riscv/purgatory/purgatory.ro: in function `.L13':
>> string.c:(.text+0x1832): undefined reference to `__warn_printk'
riscv64-linux-ld: arch/riscv/purgatory/purgatory.ro: in function `.L3':
string.c:(.text+0x187a): undefined reference to `__warn_printk'
riscv64-linux-ld: arch/riscv/purgatory/purgatory.ro: in function `.L6':
string.c:(.text+0x18c4): undefined reference to `__warn_printk'

--
0-DAY CI Kernel Test Service
https://01.org/lkp

