2023-08-07 21:41:14

by Justin Stitt

[permalink] [raw]
Subject: [PATCH] um: refactor deprecated strncpy to strtomem

Use `strtomem` here since `console_buf` is not expected to be
NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`
instead of using `ARRAY_SIZE()` for every iteration of the loop.

Also mark char buffer as `__nonstring` as per Kees' suggestion here [1]

Finally, follow checkpatch's recommendation of using `min_t` over `min`

Link: https://github.com/KSPP/linux/issues/90 [1]
Cc: [email protected]
Signed-off-by: Justin Stitt <[email protected]>
---
Notes:
I only build tested this patch.
---
arch/um/drivers/mconsole_kern.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
index 5026e7b9adfe..fd4c024202ae 100644
--- a/arch/um/drivers/mconsole_kern.c
+++ b/arch/um/drivers/mconsole_kern.c
@@ -4,6 +4,7 @@
* Copyright (C) 2001 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
*/

+#include "linux/compiler_attributes.h"
#include <linux/console.h>
#include <linux/ctype.h>
#include <linux/string.h>
@@ -554,7 +555,7 @@ struct mconsole_output {

static DEFINE_SPINLOCK(client_lock);
static LIST_HEAD(clients);
-static char console_buf[MCONSOLE_MAX_DATA];
+static char console_buf[MCONSOLE_MAX_DATA] __nonstring;

static void console_write(struct console *console, const char *string,
unsigned int len)
@@ -566,8 +567,8 @@ static void console_write(struct console *console, const char *string,
return;

while (len > 0) {
- n = min((size_t) len, ARRAY_SIZE(console_buf));
- strncpy(console_buf, string, n);
+ n = min_t(size_t, len, MCONSOLE_MAX_DATA);
+ strtomem(console_buf, string);
string += n;
len -= n;


---
base-commit: c1a515d3c0270628df8ae5f5118ba859b85464a2
change-id: 20230807-arch-um-3ef24413427e

Best regards,
--
Justin Stitt <[email protected]>



2023-08-07 23:07:48

by Bill Wendling

[permalink] [raw]
Subject: Re: [PATCH] um: refactor deprecated strncpy to strtomem

On Mon, Aug 7, 2023 at 2:18 PM Justin Stitt <[email protected]> wrote:
>
> Use `strtomem` here since `console_buf` is not expected to be
> NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`
> instead of using `ARRAY_SIZE()` for every iteration of the loop.
>
Is this change necessary? I have a general preference for ARRAY_SIZE,
because a change in size is less likely to be overlooked (unless that
goes against the coding standard).

> Also mark char buffer as `__nonstring` as per Kees' suggestion here [1]
>
> Finally, follow checkpatch's recommendation of using `min_t` over `min`
>
> Link: https://github.com/KSPP/linux/issues/90 [1]
> Cc: [email protected]
> Signed-off-by: Justin Stitt <[email protected]>
> ---
> Notes:
> I only build tested this patch.
> ---
> arch/um/drivers/mconsole_kern.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
> index 5026e7b9adfe..fd4c024202ae 100644
> --- a/arch/um/drivers/mconsole_kern.c
> +++ b/arch/um/drivers/mconsole_kern.c
> @@ -4,6 +4,7 @@
> * Copyright (C) 2001 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
> */
>
> +#include "linux/compiler_attributes.h"

nit: Should this include be in angle brackets?

#include <linux/compiler_attributes.h>

> #include <linux/console.h>
> #include <linux/ctype.h>
> #include <linux/string.h>
> @@ -554,7 +555,7 @@ struct mconsole_output {
>
> static DEFINE_SPINLOCK(client_lock);
> static LIST_HEAD(clients);
> -static char console_buf[MCONSOLE_MAX_DATA];
> +static char console_buf[MCONSOLE_MAX_DATA] __nonstring;
>
> static void console_write(struct console *console, const char *string,
> unsigned int len)
> @@ -566,8 +567,8 @@ static void console_write(struct console *console, const char *string,
> return;
>
> while (len > 0) {
> - n = min((size_t) len, ARRAY_SIZE(console_buf));
> - strncpy(console_buf, string, n);
> + n = min_t(size_t, len, MCONSOLE_MAX_DATA);
> + strtomem(console_buf, string);
> string += n;
> len -= n;
>
>
> ---
> base-commit: c1a515d3c0270628df8ae5f5118ba859b85464a2
> change-id: 20230807-arch-um-3ef24413427e
>
> Best regards,
> --
> Justin Stitt <[email protected]>
>

2023-08-08 02:02:53

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] um: refactor deprecated strncpy to strtomem

On Mon, Aug 07, 2023 at 03:36:55PM -0700, Bill Wendling wrote:
> On Mon, Aug 7, 2023 at 2:18 PM Justin Stitt <[email protected]> wrote:
> >
> > Use `strtomem` here since `console_buf` is not expected to be
> > NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`

How is it known that console_buf is not a C-string?

> > instead of using `ARRAY_SIZE()` for every iteration of the loop.
> >
> Is this change necessary? I have a general preference for ARRAY_SIZE,
> because a change in size is less likely to be overlooked (unless that
> goes against the coding standard).

I would prefer this stay either ARRAY_SIZE or sizeof, as it keeps it
tied to the variable in question.

>
> > Also mark char buffer as `__nonstring` as per Kees' suggestion here [1]
> >
> > Finally, follow checkpatch's recommendation of using `min_t` over `min`
> >
> > Link: https://github.com/KSPP/linux/issues/90 [1]
> > Cc: [email protected]
> > Signed-off-by: Justin Stitt <[email protected]>
> > ---
> > Notes:
> > I only build tested this patch.
> > ---
> > arch/um/drivers/mconsole_kern.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
> > index 5026e7b9adfe..fd4c024202ae 100644
> > --- a/arch/um/drivers/mconsole_kern.c
> > +++ b/arch/um/drivers/mconsole_kern.c
> > @@ -4,6 +4,7 @@
> > * Copyright (C) 2001 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
> > */
> >
> > +#include "linux/compiler_attributes.h"
>
> nit: Should this include be in angle brackets?
>
> #include <linux/compiler_attributes.h>

True, though this shouldn't need to be included at all. What was
missing?

>
> > #include <linux/console.h>
> > #include <linux/ctype.h>
> > #include <linux/string.h>
> > @@ -554,7 +555,7 @@ struct mconsole_output {
> >
> > static DEFINE_SPINLOCK(client_lock);
> > static LIST_HEAD(clients);
> > -static char console_buf[MCONSOLE_MAX_DATA];
> > +static char console_buf[MCONSOLE_MAX_DATA] __nonstring;
> >
> > static void console_write(struct console *console, const char *string,
> > unsigned int len)
> > @@ -566,8 +567,8 @@ static void console_write(struct console *console, const char *string,
> > return;
> >
> > while (len > 0) {
> > - n = min((size_t) len, ARRAY_SIZE(console_buf));
> > - strncpy(console_buf, string, n);
> > + n = min_t(size_t, len, MCONSOLE_MAX_DATA);
> > + strtomem(console_buf, string);
> > string += n;
> > len -= n;
> >
> >
> > ---
> > base-commit: c1a515d3c0270628df8ae5f5118ba859b85464a2
> > change-id: 20230807-arch-um-3ef24413427e
> >
> > Best regards,
> > --
> > Justin Stitt <[email protected]>
> >

--
Kees Cook

2023-08-08 21:59:27

by Justin Stitt

[permalink] [raw]
Subject: Re: [PATCH] um: refactor deprecated strncpy to strtomem

On Mon, Aug 7, 2023 at 4:40 PM Kees Cook <[email protected]> wrote:
>
> On Mon, Aug 07, 2023 at 03:36:55PM -0700, Bill Wendling wrote:
> > On Mon, Aug 7, 2023 at 2:18 PM Justin Stitt <[email protected]> wrote:
> > >
> > > Use `strtomem` here since `console_buf` is not expected to be
> > > NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`
>
> How is it known that console_buf is not a C-string?
There are a few clues that led me to believe console_buf was not a C-string:
1) `n = min((size_t) len, ARRAY_SIZE(console_buf));` means that `n`
can be equal to size of buffer which is then subsequently filled
entirely by the `strncpy` invocation.
2) console_buf looks to be a circular buffer wherein once it's filled
it starts again from the beginning.
3) ARRAY_SIZE is used on it as opposed to strlen or something like
that (but not sure if ARRAY_SIZE is also used on C-strings to be fair)
4) It has `buf` in its name which I loosely associate with non
C-strings char buffers.

All in all, it looks to be a non C-string but honestly it's hard to
tell, especially since if it IS a C-string the previous implementation
had some bugs with strncpy I believe.

>
> > > instead of using `ARRAY_SIZE()` for every iteration of the loop.
> > >
> > Is this change necessary? I have a general preference for ARRAY_SIZE,
> > because a change in size is less likely to be overlooked (unless that
> > goes against the coding standard).
>
> I would prefer this stay either ARRAY_SIZE or sizeof, as it keeps it
> tied to the variable in question.
>
> >
> > > Also mark char buffer as `__nonstring` as per Kees' suggestion here [1]
> > >
> > > Finally, follow checkpatch's recommendation of using `min_t` over `min`
> > >
> > > Link: https://github.com/KSPP/linux/issues/90 [1]
> > > Cc: [email protected]
> > > Signed-off-by: Justin Stitt <[email protected]>
> > > ---
> > > Notes:
> > > I only build tested this patch.
> > > ---
> > > arch/um/drivers/mconsole_kern.c | 7 ++++---
> > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c
> > > index 5026e7b9adfe..fd4c024202ae 100644
> > > --- a/arch/um/drivers/mconsole_kern.c
> > > +++ b/arch/um/drivers/mconsole_kern.c
> > > @@ -4,6 +4,7 @@
> > > * Copyright (C) 2001 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
> > > */
> > >
> > > +#include "linux/compiler_attributes.h"
> >
> > nit: Should this include be in angle brackets?
> >
> > #include <linux/compiler_attributes.h>
>
> True, though this shouldn't need to be included at all. What was
> missing?
>
> >
> > > #include <linux/console.h>
> > > #include <linux/ctype.h>
> > > #include <linux/string.h>
> > > @@ -554,7 +555,7 @@ struct mconsole_output {
> > >
> > > static DEFINE_SPINLOCK(client_lock);
> > > static LIST_HEAD(clients);
> > > -static char console_buf[MCONSOLE_MAX_DATA];
> > > +static char console_buf[MCONSOLE_MAX_DATA] __nonstring;
> > >
> > > static void console_write(struct console *console, const char *string,
> > > unsigned int len)
> > > @@ -566,8 +567,8 @@ static void console_write(struct console *console, const char *string,
> > > return;
> > >
> > > while (len > 0) {
> > > - n = min((size_t) len, ARRAY_SIZE(console_buf));
> > > - strncpy(console_buf, string, n);
> > > + n = min_t(size_t, len, MCONSOLE_MAX_DATA);
> > > + strtomem(console_buf, string);
> > > string += n;
> > > len -= n;
> > >
> > >
> > > ---
> > > base-commit: c1a515d3c0270628df8ae5f5118ba859b85464a2
> > > change-id: 20230807-arch-um-3ef24413427e
> > >
> > > Best regards,
> > > --
> > > Justin Stitt <[email protected]>
> > >
>
> --
> Kees Cook

2023-08-09 01:08:40

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] um: refactor deprecated strncpy to strtomem

On Tue, Aug 08, 2023 at 10:28:57AM -0700, Justin Stitt wrote:
> On Mon, Aug 7, 2023 at 4:40 PM Kees Cook <[email protected]> wrote:
> >
> > On Mon, Aug 07, 2023 at 03:36:55PM -0700, Bill Wendling wrote:
> > > On Mon, Aug 7, 2023 at 2:18 PM Justin Stitt <[email protected]> wrote:
> > > >
> > > > Use `strtomem` here since `console_buf` is not expected to be
> > > > NUL-terminated. We should probably also just use `MCONSOLE_MAX_DATA`
> >
> > How is it known that console_buf is not a C-string?
> There are a few clues that led me to believe console_buf was not a C-string:
> 1) `n = min((size_t) len, ARRAY_SIZE(console_buf));` means that `n`
> can be equal to size of buffer which is then subsequently filled
> entirely by the `strncpy` invocation.
> 2) console_buf looks to be a circular buffer wherein once it's filled
> it starts again from the beginning.
> 3) ARRAY_SIZE is used on it as opposed to strlen or something like
> that (but not sure if ARRAY_SIZE is also used on C-strings to be fair)
> 4) It has `buf` in its name which I loosely associate with non
> C-strings char buffers.
>
> All in all, it looks to be a non C-string but honestly it's hard to
> tell, especially since if it IS a C-string the previous implementation
> had some bugs with strncpy I believe.

I took a look at this. It's only used in that one function, and it's
always working from the start, and uses at max ARRAY_SIZE(console_buf),
as you mentioned. Then it's passed to mconsole_reply_len() with "len",
so we can examine that:

do {
...
len = MIN(total, MCONSOLE_MAX_DATA - 1);
...
memcpy(reply.data, str, len);
reply.data[len] = '\0';
total -= len;
...
} while (total > 0);

It's sending it as NUL-terminated, possibly breaking it up. If this used
the whole MCONSOLE_MAX_DATA, it would send MCONSOLE_MAX_DATA - 1 bytes
followed by NUL, and then 1 byte, followed by NUL. :P

Anyway, point being, yes, it appears that console_buf is a nonstring.
But it's a weird one...

In your v2 patch, you use strtomem(), which is probably close, but in
looking at the implementation details here, I'm starting to think that
strtomem() needs to return the number of bytes copied. Initially I was
thinking it could actually just be replaced with memcpy(), but there
are some side-effects going on in the original code.

First:

static void console_write(..., const char *string, unsigned int len)
...
n = min((size_t) len, ARRAY_SIZE(console_buf));
strncpy(console_buf, string, n);

The 'n' is being passed in, so it's possible that "string" has NUL-bytes
in it. (Though I would assume, unlikely -- I haven't tracked down the
callers of console_write() here...)

That means that strncpy() will stop the copy at the first NUL and
then NUL pad the rest to 'n', and that's what gets passed down to
mconsole_reply_len() (which is specifically using memcpy and will carry
those NUL bytes forward). So we should not lose the NUL-padding behavior.

Additionally, when "len" is smaller than MCONSOLE_MAX_DATA, the padding
will only go up to "len", not MCONSOLE_MAX_DATA. So there's a weird
behavioral difference here where the new code is doing more work, but,
okay fine.

I find this original code to be rather confused about whether it is
dealing with C-strings or byte arrays. :|

So now I wanted to find the callers of struct console::write. My
Coccinelle script:

@found@
struct console *CON;
@@

* CON->write(...)

show hits in kernel/printk/printk.c, which is dealing with a
NUL-terminated string constructed by vsnprintf():

console_emit_next_record

Technically there could be NUL bytes in such a string, but almost
everything else expects to be dealing with C-strings here. This looks
really fragile.

So, I'm back around to thinking this should just be memcpy():

- strncpy(console_buf, string, n);
+ memcpy(console_buf, string, n);


--
Kees Cook