LinuxLists.cc - [PATCH] unicode: don't write -1 after NULL terminator

2022-11-03 02:05:08

Subject: [PATCH] unicode: don't write -1 after NULL terminator

If the intention is to overwrite the first NULL with a -1, s[strlen(s)]
is the first NULL, not s[strlen(s)+1].

Cc: Gabriel Krisman Bertazi <[email protected]>
Cc: [email protected]
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
fs/unicode/mkutf8data.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
index bc1a7c8b5c8d..61800e0d3226 100644
--- a/fs/unicode/mkutf8data.c
+++ b/fs/unicode/mkutf8data.c
@@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
/* Second test: length-limited string. */
s = buf2;
/* Replace NUL with a value that will cause an error if seen. */
- s[strlen(s) + 1] = -1;
+ s[strlen(s)] = -1;
t = buf3;
if (utf8cursor(&u8c, tree, s))
return -1;
--
2.38.1
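
Editor's illustration of the indexing the changelog refers to. This is a minimal sketch and not code from mkutf8data.c; the helper name mark_at is hypothetical. It shows only that s[strlen(s)] is the terminator itself, while s[strlen(s) + 1] is the byte after it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mkutf8data.c: write `mark` at
 * s[strlen(s) + offset] and return the index that was written.
 * The caller must guarantee the buffer holds at least
 * strlen(s) + offset + 1 bytes.
 */
static size_t mark_at(char *s, size_t offset, char mark)
{
	size_t idx = strlen(s) + offset;

	s[idx] = mark;
	return idx;
}
```

With a zero-filled 8-byte buffer holding "abc", mark_at(buf, 1, -1) writes index 4 and leaves the string length at 3; mark_at(buf, 0, -1) writes index 3, which removes the terminator, so strlen() then runs on to the next zero byte.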

2022-11-03 07:25:48

by Jiri Slaby

Subject: Re: [PATCH] unicode: don't write -1 after NULL terminator

On 03. 11. 22, 2:24, Jason A. Donenfeld wrote:
> If the intention is to overwrite the first NULL with a -1, s[strlen(s)]
> is the first NULL, not s[strlen(s)+1].

This caught my attention. You mix NULL (void *) with NUL (\0) in the
changelog & subject. That is rather confusing to me.

> Cc: Gabriel Krisman Bertazi <[email protected]>
> Cc: [email protected]
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> fs/unicode/mkutf8data.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> index bc1a7c8b5c8d..61800e0d3226 100644
> --- a/fs/unicode/mkutf8data.c
> +++ b/fs/unicode/mkutf8data.c
> @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
> /* Second test: length-limited string. */
> s = buf2;
> /* Replace NUL with a value that will cause an error if seen. */
> - s[strlen(s) + 1] = -1;
> + s[strlen(s)] = -1;
> t = buf3;
> if (utf8cursor(&u8c, tree, s))
> return -1;

--
js
suse labs

2022-11-03 11:39:08

by Jason A. Donenfeld

Subject: [PATCH v2] unicode: don't write -1 after NUL terminator

If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
is the first NUL, not s[strlen(s)+1].

Cc: Gabriel Krisman Bertazi <[email protected]>
Cc: [email protected]
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
fs/unicode/mkutf8data.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
index bc1a7c8b5c8d..61800e0d3226 100644
--- a/fs/unicode/mkutf8data.c
+++ b/fs/unicode/mkutf8data.c
@@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
/* Second test: length-limited string. */
s = buf2;
/* Replace NUL with a value that will cause an error if seen. */
- s[strlen(s) + 1] = -1;
+ s[strlen(s)] = -1;
t = buf3;
if (utf8cursor(&u8c, tree, s))
return -1;
--
2.38.1

2022-11-07 15:56:56

by Jason A. Donenfeld

Subject: Re: [PATCH v2] unicode: don't write -1 after NUL terminator

Hi Gabriel,

On Mon, Nov 07, 2022 at 09:45:25AM -0500, Gabriel Krisman Bertazi wrote:
> "Jason A. Donenfeld" <[email protected]> writes:
>
> > If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
> > is the first NUL, not s[strlen(s)+1].
>
> Hi Jason,
>
> This code is part of the verification of the trie that is done at the end
> of utf8data generation. It is making sure the tree is not corrupted, by
> ensuring that utf8byte doesn't see something past the correct end of the
> string (the first NUL byte). Note it is not a bad memory access
> either, since we guarantee to have allocated enough space.
>
> So I think the code is correct as is. If you apply your patch and
> regenerate utf8data.h_shipped, utf8byte will reach that -1 and fail the
> verification.

Ah, okay. "Replace NUL" would seem to be wrong/confusing comment text I
suppose. Thanks for the explanation anyhow, and sorry for the noise.

Jason

>
> > Cc: Gabriel Krisman Bertazi <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Jason A. Donenfeld <[email protected]>
> > ---
> > fs/unicode/mkutf8data.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> > index bc1a7c8b5c8d..61800e0d3226 100644
> > --- a/fs/unicode/mkutf8data.c
> > +++ b/fs/unicode/mkutf8data.c
> > @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
> > /* Second test: length-limited string. */
> > s = buf2;
> > /* Replace NUL with a value that will cause an error if seen. */
> > - s[strlen(s) + 1] = -1;
> > + s[strlen(s)] = -1;
> > t = buf3;
> > if (utf8cursor(&u8c, tree, s))
> > return -1;
>
> --
> Gabriel Krisman Bertazi

2022-11-07 16:04:10

by Gabriel Krisman Bertazi

Subject: Re: [PATCH v2] unicode: don't write -1 after NUL terminator

"Jason A. Donenfeld" <[email protected]> writes:

> If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
> is the first NUL, not s[strlen(s)+1].

Hi Jason,

This code is part of the verification of the trie that is done at the end
of utf8data generation. It is making sure the tree is not corrupted, by
ensuring that utf8byte doesn't see something past the correct end of the
string (the first NUL byte). Note it is not a bad memory access
either, since we guarantee to have allocated enough space.

So I think the code is correct as is. If you apply your patch and
regenerate utf8data.h_shipped, utf8byte will reach that -1 and fail the
verification.
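
Editor's illustration of the check described above. This is a toy sketch under stated assumptions, not the kernel's utf8byte() or the actual test in normalize_line(): scan_checked and the two check_* helpers are hypothetical names, and the toy reader treats any negative byte as poison, so it is only meaningful for ASCII input. It shows why a poison byte placed after the terminator is never seen by a reader that stops at the right place, while a poison byte written over the terminator is always reached.

```c
#include <string.h>

/*
 * Hypothetical stand-in for a byte reader: walks s up to the NUL
 * terminator and returns the number of bytes consumed, or -1 if it
 * ever sees a negative (poison) byte. ASCII input only.
 */
static int scan_checked(const char *s)
{
	int n = 0;

	for (; *s != '\0'; s++, n++)
		if ((signed char)*s < 0)
			return -1;
	return n;
}

/*
 * Poison the byte after the terminator. A reader that stops at the
 * NUL never sees it. The buffer must have room for strlen(buf) + 2 bytes.
 */
static int check_poison_after_nul(char *buf)
{
	buf[strlen(buf) + 1] = -1;
	return scan_checked(buf);
}

/* Poison the terminator itself. The reader runs straight into it. */
static int check_poison_at_nul(char *buf)
{
	buf[strlen(buf)] = -1;
	return scan_checked(buf);
}
```

For an 8-byte zero-filled buffer holding "abc", check_poison_after_nul() returns 3 and check_poison_at_nul() returns -1, which mirrors the failure predicted for the patched code.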

> Cc: Gabriel Krisman Bertazi <[email protected]>
> Cc: [email protected]
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> fs/unicode/mkutf8data.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> index bc1a7c8b5c8d..61800e0d3226 100644
> --- a/fs/unicode/mkutf8data.c
> +++ b/fs/unicode/mkutf8data.c
> @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
> /* Second test: length-limited string. */
> s = buf2;
> /* Replace NUL with a value that will cause an error if seen. */
> - s[strlen(s) + 1] = -1;
> + s[strlen(s)] = -1;
> t = buf3;
> if (utf8cursor(&u8c, tree, s))
> return -1;

--
Gabriel Krisman Bertazi