2021-09-17 07:20:02

by Tobias Klauser

[permalink] [raw]
Subject: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

The changes in the patch series [1] introduced a terminating null byte
when reading from cpulist or cpumap sysfs files, for example:

$ xxd /sys/devices/system/node/node0/cpulist
00000000: 302d 310a 00 0-1..

Before this change, the output looked as follows:

$ xxd /sys/devices/system/node/node0/cpulist
00000000: 302d 310a 0-1.

Fix this regression by excluding the terminating null byte from the
returned length in cpumap_print_list_to_buf and
cpumap_print_bitmask_to_buf.

[1] https://lore.kernel.org/all/[email protected]/

Fixes: 1fae562983ca ("cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list")
Signed-off-by: Tobias Klauser <[email protected]>
---
include/linux/cpumask.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 5d4d07a9e1ed..1e7399fc69c0 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -996,14 +996,15 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
* cpumask; Typically used by bin_attribute to export cpumask bitmask
* ABI.
*
- * Returns the length of how many bytes have been copied.
+ * Returns the length of how many bytes have been copied, excluding
+ * terminating '\0'.
*/
static inline ssize_t
cpumap_print_bitmask_to_buf(char *buf, const struct cpumask *mask,
loff_t off, size_t count)
{
return bitmap_print_bitmask_to_buf(buf, cpumask_bits(mask),
- nr_cpu_ids, off, count);
+ nr_cpu_ids, off, count) - 1;
}

/**
@@ -1018,7 +1019,7 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask *mask,
loff_t off, size_t count)
{
return bitmap_print_list_to_buf(buf, cpumask_bits(mask),
- nr_cpu_ids, off, count);
+ nr_cpu_ids, off, count) - 1;
}

#if NR_CPUS <= BITS_PER_LONG
--
2.33.0


Subject: RE: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf



> -----Original Message-----
> From: Tobias Klauser [mailto:[email protected]]
> Sent: Friday, September 17, 2021 10:27 AM
> To: Greg Kroah-Hartman <[email protected]>; Jonathan Cameron
> <[email protected]>; tiantao (H) <[email protected]>; Song Bao
> Hua (Barry Song) <[email protected]>
> Cc: Andrew Morton <[email protected]>; Andy Shevchenko
> <[email protected]>; Yury Norov <[email protected]>; Peter
> Zijlstra <[email protected]>; [email protected]
> Subject: [PATCH] cpumask: Omit terminating null byte in
> cpumap_print_{list,bitmask}_to_buf
>
> The changes in the patch series [1] introduced a terminating null byte
> when reading from cpulist or cpumap sysfs files, for example:
>
> $ xxd /sys/devices/system/node/node0/cpulist
> 00000000: 302d 310a 00 0-1..
>
> Before this change, the output looked as follows:
>
> $ xxd /sys/devices/system/node/node0/cpulist
> 00000000: 302d 310a 0-1.

If we don't use xxd, I don't see any actual harm of this NULL byte
by cat, lscpu, numactl etc. this doesn't break them at all.

if we only want to make sure the output is exactly same with before
for every single character, this patch is right.

>
> Fix this regression by excluding the terminating null byte from the
> returned length in cpumap_print_list_to_buf and
> cpumap_print_bitmask_to_buf.
>
> [1]
> https://lore.kernel.org/all/[email protected]
> m/
>
> Fixes: 1fae562983ca ("cpumask: introduce cpumap_print_list/bitmask_to_buf to
> support large bitmask and list")
> Signed-off-by: Tobias Klauser <[email protected]>
> ---
> include/linux/cpumask.h | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 5d4d07a9e1ed..1e7399fc69c0 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -996,14 +996,15 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct
> cpumask *mask)
> * cpumask; Typically used by bin_attribute to export cpumask bitmask
> * ABI.
> *
> - * Returns the length of how many bytes have been copied.
> + * Returns the length of how many bytes have been copied, excluding
> + * terminating '\0'.
> */
> static inline ssize_t
> cpumap_print_bitmask_to_buf(char *buf, const struct cpumask *mask,
> loff_t off, size_t count)
> {
> return bitmap_print_bitmask_to_buf(buf, cpumask_bits(mask),
> - nr_cpu_ids, off, count);
> + nr_cpu_ids, off, count) - 1;
> }
>
> /**
> @@ -1018,7 +1019,7 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask
> *mask,
> loff_t off, size_t count)
> {
> return bitmap_print_list_to_buf(buf, cpumask_bits(mask),
> - nr_cpu_ids, off, count);
> + nr_cpu_ids, off, count) - 1;
> }
>
> #if NR_CPUS <= BITS_PER_LONG
> --
> 2.33.0

Thanks
Barry

2021-09-17 10:45:37

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

[CC Greg KH <[email protected]>]

On Thu, Sep 16, 2021 at 10:53:39PM +0000, Song Bao Hua (Barry Song) wrote:
>
>
> > -----Original Message-----
> > From: Tobias Klauser [mailto:[email protected]]
> > Sent: Friday, September 17, 2021 10:27 AM
> > To: Greg Kroah-Hartman <[email protected]>; Jonathan Cameron
> > <[email protected]>; tiantao (H) <[email protected]>; Song Bao
> > Hua (Barry Song) <[email protected]>
> > Cc: Andrew Morton <[email protected]>; Andy Shevchenko
> > <[email protected]>; Yury Norov <[email protected]>; Peter
> > Zijlstra <[email protected]>; [email protected]
> > Subject: [PATCH] cpumask: Omit terminating null byte in
> > cpumap_print_{list,bitmask}_to_buf
> >
> > The changes in the patch series [1] introduced a terminating null byte
> > when reading from cpulist or cpumap sysfs files, for example:
> >
> > $ xxd /sys/devices/system/node/node0/cpulist
> > 00000000: 302d 310a 00 0-1..
> >
> > Before this change, the output looked as follows:
> >
> > $ xxd /sys/devices/system/node/node0/cpulist
> > 00000000: 302d 310a 0-1.
>
> If we don't use xxd, I don't see any actual harm of this NULL byte
> by cat, lscpu, numactl etc. this doesn't break them at all.

Barry, Tobias' script that uses xxd is userspace. Linux kernel never breaks
userspace.

> if we only want to make sure the output is exactly same with before
> for every single character, this patch is right.

We don't want to make the output exactly the same. The "0,1" would
also work for the example above. But garbage characters following \0
is a bug that should be fixed.

Greg, would you like to move this patch through your tree?

Acked-by: Yury Norov <[email protected]>

> > Fix this regression by excluding the terminating null byte from the
> > returned length in cpumap_print_list_to_buf and
> > cpumap_print_bitmask_to_buf.
> >
> > [1]
> > https://lore.kernel.org/all/[email protected]
> > m/
> >
> > Fixes: 1fae562983ca ("cpumask: introduce cpumap_print_list/bitmask_to_buf to
> > support large bitmask and list")
> > Signed-off-by: Tobias Klauser <[email protected]>
> > ---
> > include/linux/cpumask.h | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index 5d4d07a9e1ed..1e7399fc69c0 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -996,14 +996,15 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct
> > cpumask *mask)
> > * cpumask; Typically used by bin_attribute to export cpumask bitmask
> > * ABI.
> > *
> > - * Returns the length of how many bytes have been copied.
> > + * Returns the length of how many bytes have been copied, excluding
> > + * terminating '\0'.
> > */
> > static inline ssize_t
> > cpumap_print_bitmask_to_buf(char *buf, const struct cpumask *mask,
> > loff_t off, size_t count)
> > {
> > return bitmap_print_bitmask_to_buf(buf, cpumask_bits(mask),
> > - nr_cpu_ids, off, count);
> > + nr_cpu_ids, off, count) - 1;
> > }
> >
> > /**
> > @@ -1018,7 +1019,7 @@ cpumap_print_list_to_buf(char *buf, const struct cpumask
> > *mask,
> > loff_t off, size_t count)
> > {
> > return bitmap_print_list_to_buf(buf, cpumask_bits(mask),
> > - nr_cpu_ids, off, count);
> > + nr_cpu_ids, off, count) - 1;
> > }
> >
> > #if NR_CPUS <= BITS_PER_LONG
> > --
> > 2.33.0
>
> Thanks
> Barry

2021-09-17 13:27:23

by Tobias Klauser

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

On 2021-09-17 at 01:19:04 +0200, Yury Norov <[email protected]> wrote:
> [CC Greg KH <[email protected]>]
>
> On Thu, Sep 16, 2021 at 10:53:39PM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: Tobias Klauser [mailto:[email protected]]
> > > Sent: Friday, September 17, 2021 10:27 AM
> > > To: Greg Kroah-Hartman <[email protected]>; Jonathan Cameron
> > > <[email protected]>; tiantao (H) <[email protected]>; Song Bao
> > > Hua (Barry Song) <[email protected]>
> > > Cc: Andrew Morton <[email protected]>; Andy Shevchenko
> > > <[email protected]>; Yury Norov <[email protected]>; Peter
> > > Zijlstra <[email protected]>; [email protected]
> > > Subject: [PATCH] cpumask: Omit terminating null byte in
> > > cpumap_print_{list,bitmask}_to_buf
> > >
> > > The changes in the patch series [1] introduced a terminating null byte
> > > when reading from cpulist or cpumap sysfs files, for example:
> > >
> > > $ xxd /sys/devices/system/node/node0/cpulist
> > > 00000000: 302d 310a 00 0-1..
> > >
> > > Before this change, the output looked as follows:
> > >
> > > $ xxd /sys/devices/system/node/node0/cpulist
> > > 00000000: 302d 310a 0-1.
> >
> > If we don't use xxd, I don't see any actual harm of this NULL byte
> > by cat, lscpu, numactl etc. this doesn't break them at all.
>
> Barry, Tobias' script that uses xxd is userspace. Linux kernel never breaks
> userspace.

FWIW, the example using xxd was just to illustrate the issue in a
concise way for the commit message. This is breaking other userspace
programs as well. Originally, I discovered this because Kubernetes'
kubelet was crashing on a bpf-next kernel. See [1] and following
comments for more information:

[1] https://github.com/cilium/cilium/pull/17394#issuecomment-920902042

> > if we only want to make sure the output is exactly same with before
> > for every single character, this patch is right.
>
> We don't want to make the output exactly the same. The "0,1" would
> also work for the example above. But garbage characters following \0
> is a bug that should be fixed.

I think we also want to avoid the \0 itself, which is what this patch
does and is in line with previous behavior. It also looks like all other
sysfs files in that subtree expose the same content format (i.e. \n is
the last character, not \0).

Thanks,
Tobias

2021-09-19 12:27:12

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

On Sat, Sep 18, 2021 at 1:27 AM Tobias Klauser <[email protected]> wrote:
>
> On 2021-09-17 at 01:19:04 +0200, Yury Norov <[email protected]> wrote:
> > [CC Greg KH <[email protected]>]
> >
> > On Thu, Sep 16, 2021 at 10:53:39PM +0000, Song Bao Hua (Barry Song) wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Tobias Klauser [mailto:[email protected]]
> > > > Sent: Friday, September 17, 2021 10:27 AM
> > > > To: Greg Kroah-Hartman <[email protected]>; Jonathan Cameron
> > > > <[email protected]>; tiantao (H) <[email protected]>; Song Bao
> > > > Hua (Barry Song) <[email protected]>
> > > > Cc: Andrew Morton <[email protected]>; Andy Shevchenko
> > > > <[email protected]>; Yury Norov <[email protected]>; Peter
> > > > Zijlstra <[email protected]>; [email protected]
> > > > Subject: [PATCH] cpumask: Omit terminating null byte in
> > > > cpumap_print_{list,bitmask}_to_buf
> > > >
> > > > The changes in the patch series [1] introduced a terminating null byte
> > > > when reading from cpulist or cpumap sysfs files, for example:
> > > >
> > > > $ xxd /sys/devices/system/node/node0/cpulist
> > > > 00000000: 302d 310a 00 0-1..
> > > >
> > > > Before this change, the output looked as follows:
> > > >
> > > > $ xxd /sys/devices/system/node/node0/cpulist
> > > > 00000000: 302d 310a 0-1.
> > >
> > > If we don't use xxd, I don't see any actual harm of this NULL byte
> > > by cat, lscpu, numactl etc. this doesn't break them at all.
> >
> > Barry, Tobias' script that uses xxd is userspace. Linux kernel never breaks
> > userspace.
>
> FWIW, the example using xxd was just to illustrate the issue in a
> concise way for the commit message. This is breaking other userspace
> programs as well. Originally, I discovered this because Kubernetes'
> kubelet was crashing on a bpf-next kernel. See [1] and following
> comments for more information:
>
> [1] https://github.com/cilium/cilium/pull/17394#issuecomment-920902042
>

cat, lscpu, numactl tools were tested. the above was not in the test cases.
Anyway, if some apps depend on the last character, this patch makes
sense. we need this one. sorry for missing the test case.

Acked-by: Barry Song <[email protected]>

Greg, can you please help merge this one into 5.15?

> > > if we only want to make sure the output is exactly same with before
> > > for every single character, this patch is right.
> >
> > We don't want to make the output exactly the same. The "0,1" would
> > also work for the example above. But garbage characters following \0
> > is a bug that should be fixed.
>
> I think we also want to avoid the \0 itself, which is what this patch
> does and is in line with previous behavior. It also looks like all other
> sysfs files in that subtree expose the same content format (i.e. \n is
> the last character, not \0).
>
> Thanks,
> Tobias

Thanks
barry

2021-09-19 12:37:41

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

On Sun, Sep 19, 2021 at 07:33:52PM +1200, Barry Song wrote:
> On Sat, Sep 18, 2021 at 1:27 AM Tobias Klauser <[email protected]> wrote:
> >
> > On 2021-09-17 at 01:19:04 +0200, Yury Norov <[email protected]> wrote:
> > > [CC Greg KH <[email protected]>]
> > >
> > > On Thu, Sep 16, 2021 at 10:53:39PM +0000, Song Bao Hua (Barry Song) wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Tobias Klauser [mailto:[email protected]]
> > > > > Sent: Friday, September 17, 2021 10:27 AM
> > > > > To: Greg Kroah-Hartman <[email protected]>; Jonathan Cameron
> > > > > <[email protected]>; tiantao (H) <[email protected]>; Song Bao
> > > > > Hua (Barry Song) <[email protected]>
> > > > > Cc: Andrew Morton <[email protected]>; Andy Shevchenko
> > > > > <[email protected]>; Yury Norov <[email protected]>; Peter
> > > > > Zijlstra <[email protected]>; [email protected]
> > > > > Subject: [PATCH] cpumask: Omit terminating null byte in
> > > > > cpumap_print_{list,bitmask}_to_buf
> > > > >
> > > > > The changes in the patch series [1] introduced a terminating null byte
> > > > > when reading from cpulist or cpumap sysfs files, for example:
> > > > >
> > > > > $ xxd /sys/devices/system/node/node0/cpulist
> > > > > 00000000: 302d 310a 00 0-1..
> > > > >
> > > > > Before this change, the output looked as follows:
> > > > >
> > > > > $ xxd /sys/devices/system/node/node0/cpulist
> > > > > 00000000: 302d 310a 0-1.
> > > >
> > > > If we don't use xxd, I don't see any actual harm of this NULL byte
> > > > by cat, lscpu, numactl etc. this doesn't break them at all.
> > >
> > > Barry, Tobias' script that uses xxd is userspace. Linux kernel never breaks
> > > userspace.
> >
> > FWIW, the example using xxd was just to illustrate the issue in a
> > concise way for the commit message. This is breaking other userspace
> > programs as well. Originally, I discovered this because Kubernetes'
> > kubelet was crashing on a bpf-next kernel. See [1] and following
> > comments for more information:
> >
> > [1] https://github.com/cilium/cilium/pull/17394#issuecomment-920902042
> >
>
> cat, lscpu, numactl tools were tested. the above was not in the test cases.
> Anyway, if some apps depend on the last character, this patch makes
> sense. we need this one. sorry for missing the test case.
>
> Acked-by: Barry Song <[email protected]>
>
> Greg, can you please help merge this one into 5.15?

Yes, will apply it to my tree soon.

thanks,

greg k-h

2021-09-30 10:32:02

by Antti Kervinen

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf


An original function, bitmap_print_to_pagebuf() in lib/bitmap.c,
returns the number of printed characters, excluding terminating null.

Commit 1fae5629, a cause of this regression, introduced new functions
to lib/bitmap.c:

- bitmap_print_to_buf()
(return value doc missing)

- bitmap_print_bitmask_to_buf()
(return value doc not explicit about terminating null, but
can be considered misleading)

- bitmap_print_list_to_buf()
(the same as above)

Unlike the original function, the return value of new functions
include the terminating null.

As this behavior is clearly opposite to the original function, and
functions that print to buffers in general, I would suggest fixing
this problem by alignign these new functions with the original one:
excluding the terminating null. And documenting this behavior
unambiguously.

The suggested change to cpumask_print_{bitmask,list}_to_buf()
functions decrements possible errors (like -ENOMEM) returned by
bitmap_print_to_buf(). This must not happen.

2021-09-30 10:48:23

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

On Thu, Sep 30, 2021 at 01:30:10PM +0300, Antti Kervinen wrote:
>
> An original function, bitmap_print_to_pagebuf() in lib/bitmap.c,
> returns the number of printed characters, excluding terminating null.
>
> Commit 1fae5629, a cause of this regression, introduced new functions
> to lib/bitmap.c:
>
> - bitmap_print_to_buf()
> (return value doc missing)
>
> - bitmap_print_bitmask_to_buf()
> (return value doc not explicit about terminating null, but
> can be considered misleading)
>
> - bitmap_print_list_to_buf()
> (the same as above)
>
> Unlike the original function, the return value of new functions
> include the terminating null.
>
> As this behavior is clearly opposite to the original function, and
> functions that print to buffers in general, I would suggest fixing
> this problem by alignign these new functions with the original one:
> excluding the terminating null. And documenting this behavior
> unambiguously.
>
> The suggested change to cpumask_print_{bitmask,list}_to_buf()
> functions decrements possible errors (like -ENOMEM) returned by
> bitmap_print_to_buf(). This must not happen.

I already pointed you at
https://lore.kernel.org/r/[email protected]
a few hours ago.

Why not test the patch there (and in linux-next) and let us know if it
resolves the issue you see or not.

thanks,

greg k-h

2021-09-30 12:24:27

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf

On Thu, Sep 30, 2021 at 11:31 PM Antti Kervinen
<[email protected]> wrote:
>
>
> An original function, bitmap_print_to_pagebuf() in lib/bitmap.c,
> returns the number of printed characters, excluding terminating null.
>

a patch has been in
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=c86a2d9058c5a4a05d20ef89e699b7a6b2c89da6

I guess it is on the way to linus' tree.

> Commit 1fae5629, a cause of this regression, introduced new functions
> to lib/bitmap.c:
>
> - bitmap_print_to_buf()
> (return value doc missing)
>
> - bitmap_print_bitmask_to_buf()
> (return value doc not explicit about terminating null, but
> can be considered misleading)
>
> - bitmap_print_list_to_buf()
> (the same as above)
>
> Unlike the original function, the return value of new functions
> include the terminating null.
>
> As this behavior is clearly opposite to the original function, and
> functions that print to buffers in general, I would suggest fixing
> this problem by alignign these new functions with the original one:
> excluding the terminating null. And documenting this behavior
> unambiguously.
>
> The suggested change to cpumask_print_{bitmask,list}_to_buf()
> functions decrements possible errors (like -ENOMEM) returned by
> bitmap_print_to_buf(). This must not happen.

Thanks
barry

2021-09-30 13:10:18

by Antti Kervinen

[permalink] [raw]
Subject: Re: [PATCH] cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf


> Why not test the patch there (and in linux-next) and let us know if
> it resolves the issue you see or not.

Yes, I conform that the patch fixes this issue in my userspace. Too
bad that I was obviously late with my code review notes.

Thanks!

Antti