2023-12-18 18:45:00

by Tanzir Hasan

[permalink] [raw]
Subject: [PATCH v3 0/2] shrink lib/string.i via IWYU

This patch series changes the include list of string.c to minimize
the preprocessing size. The patch series intends to remove REPEAT_BYE
from kernel.h and move it into its own header file because
word-at-a-time.h has an implicit dependancy on it but it is declared
in kernel.h which is bloated.

---

---
Changes in v3:
- Moved REPEAT_BYTE out of kernel.h and into wordpart.h.
- Included wordpart.h where REPEAT_BYTE was necessary.
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Transformed into a patch series
- Changed asm inclusions to linux inclusions
- added a patch to sh
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Tanzir Hasan (2):
kernel.h: removed REPEAT_BYTE from kernel.h
lib/string: shrink lib/string.i via IWYU

arch/arm/include/asm/word-at-a-time.h | 1 +
arch/arm64/include/asm/word-at-a-time.h | 1 +
arch/powerpc/include/asm/word-at-a-time.h | 1 +
arch/riscv/include/asm/word-at-a-time.h | 1 +
arch/s390/include/asm/word-at-a-time.h | 1 +
arch/sh/include/asm/word-at-a-time.h | 2 ++
arch/x86/include/asm/word-at-a-time.h | 1 +
fs/namei.c | 2 +-
include/asm-generic/word-at-a-time.h | 1 +
include/linux/kernel.h | 7 -------
include/linux/wordpart.h | 17 +++++++++++++++++
lib/string.c | 14 +++++++-------
12 files changed, 34 insertions(+), 15 deletions(-)
---
base-commit: ceb6a6f023fd3e8b07761ed900352ef574010bcb
change-id: 20231204-libstringheader-e238e2af5eec

Best regards,
--
Tanzir Hasan <[email protected]>



2023-12-18 18:45:09

by Tanzir Hasan

[permalink] [raw]
Subject: [PATCH v3 1/2] kernel.h: removed REPEAT_BYTE from kernel.h

This patch creates wordpart.h and includes it in asm/word-at-a-time.h
for the all architectures. WORD_AT_A_TIME_CONSTANTS depends on kernel.h
because of REPEAT_BYTE. Moving this to another header and including it
where necessary allows us to not include the bloated kernel.h. Making
this implicit dependancy on REPEAT_BYTE explicit allows for later
improvements in the lib/string.c inclusion list.

Suggested-by: Al Viro <[email protected]>
Suggested-by: Andy Shevchenko <[email protected]>
Signed-off-by: Tanzir Hasan <[email protected]>
---
arch/arm/include/asm/word-at-a-time.h | 1 +
arch/arm64/include/asm/word-at-a-time.h | 1 +
arch/powerpc/include/asm/word-at-a-time.h | 1 +
arch/riscv/include/asm/word-at-a-time.h | 1 +
arch/s390/include/asm/word-at-a-time.h | 1 +
arch/sh/include/asm/word-at-a-time.h | 2 ++
arch/x86/include/asm/word-at-a-time.h | 1 +
fs/namei.c | 2 +-
include/asm-generic/word-at-a-time.h | 1 +
include/linux/kernel.h | 7 -------
include/linux/wordpart.h | 17 +++++++++++++++++
11 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/word-at-a-time.h b/arch/arm/include/asm/word-at-a-time.h
index 352ab213520d..aea6bfe259f5 100644
--- a/arch/arm/include/asm/word-at-a-time.h
+++ b/arch/arm/include/asm/word-at-a-time.h
@@ -9,6 +9,7 @@
* Heavily based on the x86 algorithm.
*/
#include <linux/kernel.h>
+#include <linux/wordpart.h>

struct word_at_a_time {
const unsigned long one_bits, high_bits;
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h
index f3b151ed0d7a..cf8c28a3bba9 100644
--- a/arch/arm64/include/asm/word-at-a-time.h
+++ b/arch/arm64/include/asm/word-at-a-time.h
@@ -10,6 +10,7 @@
#ifndef __AARCH64EB__

#include <linux/kernel.h>
+#include <linux/wordpart.h>

struct word_at_a_time {
const unsigned long one_bits, high_bits;
diff --git a/arch/powerpc/include/asm/word-at-a-time.h b/arch/powerpc/include/asm/word-at-a-time.h
index 30a12d208687..032a77d324e6 100644
--- a/arch/powerpc/include/asm/word-at-a-time.h
+++ b/arch/powerpc/include/asm/word-at-a-time.h
@@ -6,6 +6,7 @@
*/

#include <linux/kernel.h>
+#include <linux/wordpart.h>
#include <asm/asm-compat.h>
#include <asm/extable.h>

diff --git a/arch/riscv/include/asm/word-at-a-time.h b/arch/riscv/include/asm/word-at-a-time.h
index 7c086ac6ecd4..9b7ea4c1a616 100644
--- a/arch/riscv/include/asm/word-at-a-time.h
+++ b/arch/riscv/include/asm/word-at-a-time.h
@@ -10,6 +10,7 @@


#include <linux/kernel.h>
+#include <linux/wordpart.h>

struct word_at_a_time {
const unsigned long one_bits, high_bits;
diff --git a/arch/s390/include/asm/word-at-a-time.h b/arch/s390/include/asm/word-at-a-time.h
index 2579f1694b82..5af1d267b4d7 100644
--- a/arch/s390/include/asm/word-at-a-time.h
+++ b/arch/s390/include/asm/word-at-a-time.h
@@ -3,6 +3,7 @@
#define _ASM_WORD_AT_A_TIME_H

#include <linux/kernel.h>
+#include <linux/wordpart.h>
#include <asm/asm-extable.h>
#include <asm/bitsperlong.h>

diff --git a/arch/sh/include/asm/word-at-a-time.h b/arch/sh/include/asm/word-at-a-time.h
index 4aa398455b94..3a11158f0143 100644
--- a/arch/sh/include/asm/word-at-a-time.h
+++ b/arch/sh/include/asm/word-at-a-time.h
@@ -5,6 +5,8 @@
#ifdef CONFIG_CPU_BIG_ENDIAN
# include <asm-generic/word-at-a-time.h>
#else
+#include <linux/kernel.h>
+#include <linux/wordpart.h>
/*
* Little-endian version cribbed from x86.
*/
diff --git a/arch/x86/include/asm/word-at-a-time.h b/arch/x86/include/asm/word-at-a-time.h
index 46b4f1f7f354..c002c864a63e 100644
--- a/arch/x86/include/asm/word-at-a-time.h
+++ b/arch/x86/include/asm/word-at-a-time.h
@@ -3,6 +3,7 @@
#define _ASM_WORD_AT_A_TIME_H

#include <linux/kernel.h>
+#include <linux/wordpart.h>

/*
* This is largely generic for little-endian machines, but the
diff --git a/fs/namei.c b/fs/namei.c
index 71c13b2990b4..03db8ca3f394 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -17,7 +17,7 @@

#include <linux/init.h>
#include <linux/export.h>
-#include <linux/kernel.h>
+#include <linux/wordpart.h>
#include <linux/slab.h>
#include <linux/fs.h>
#include <linux/filelock.h>
diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
index 95a1d214108a..fc007fda0b2e 100644
--- a/include/asm-generic/word-at-a-time.h
+++ b/include/asm-generic/word-at-a-time.h
@@ -3,6 +3,7 @@
#define _ASM_WORD_AT_A_TIME_H

#include <linux/kernel.h>
+#include <linux/wordpart.h>
#include <asm/byteorder.h>

#ifdef __BIG_ENDIAN
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index d9ad21058eed..162660af5b7d 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -39,13 +39,6 @@

#define STACK_MAGIC 0xdeadbeef

-/**
- * REPEAT_BYTE - repeat the value @x multiple times as an unsigned long value
- * @x: value to repeat
- *
- * NOTE: @x is not checked for > 0xff; larger values produce odd results.
- */
-#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x))

/* generic data direction definitions */
#define READ 0
diff --git a/include/linux/wordpart.h b/include/linux/wordpart.h
new file mode 100644
index 000000000000..6a5ed5d54ba2
--- /dev/null
+++ b/include/linux/wordpart.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Google LLC <[email protected]>
+ */
+
+#ifndef _LINUX_WORDPART_H
+#define _LINUX_WORDPART_H
+/**
+ * REPEAT_BYTE - repeat the value @x multiple times as an unsigned long value
+ * @x: value to repeat
+ *
+ * NOTE: @x is not checked for > 0xff; larger values produce odd results.
+ */
+#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x))
+
+#endif // _LINUX_WORDPART_H
+

--
2.43.0.472.g3155946c3a-goog


2023-12-18 18:47:02

by Tanzir Hasan

[permalink] [raw]
Subject: [PATCH v3 2/2] lib/string: shrink lib/string.i via IWYU

This diff uses an open source tool include-what-you-use (IWYU) to modify
the include list changing indirect includes to direct includes.
IWYU is implemented using the IWYUScripts github repository which is a tool that is
currently undergoing development. These changes seek to improve build times.

This change to lib/string.c resulted in a preprocessed size of
lib/string.i from 26371 lines to 5263 lines (-80%) for the x86
defconfig.

Link: https://github.com/ClangBuiltLinux/IWYUScripts
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Tanzir Hasan <[email protected]>
---
lib/string.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index be26623953d2..6e24a4967dd7 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -16,18 +16,18 @@

#define __NO_FORTIFY
#include <linux/types.h>
-#include <linux/string.h>
-#include <linux/ctype.h>
-#include <linux/kernel.h>
-#include <linux/export.h>
+#include <linux/bits.h>
#include <linux/bug.h>
#include <linux/errno.h>
-#include <linux/slab.h>
-
+#include <linux/linkage.h>
+#include <linux/stddef.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
#include <asm/unaligned.h>
-#include <asm/byteorder.h>
+#include <asm/rwonce.h>
#include <asm/word-at-a-time.h>
#include <asm/page.h>
+#include <vdso/limits.h>

#ifndef __HAVE_ARCH_STRNCASECMP
/**

--
2.43.0.472.g3155946c3a-goog


2023-12-18 20:26:55

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] kernel.h: removed REPEAT_BYTE from kernel.h

On Mon, Dec 18, 2023 at 06:44:47PM +0000, [email protected] wrote:
> This patch creates wordpart.h and includes it in asm/word-at-a-time.h
> for the all architectures. WORD_AT_A_TIME_CONSTANTS depends on kernel.h
> because of REPEAT_BYTE. Moving this to another header and including it
> where necessary allows us to not include the bloated kernel.h. Making
> this implicit dependancy on REPEAT_BYTE explicit allows for later
> improvements in the lib/string.c inclusion list.
>
> Suggested-by: Al Viro <[email protected]>
> Suggested-by: Andy Shevchenko <[email protected]>
> Signed-off-by: Tanzir Hasan <[email protected]>

Note, your email name is not set so this will not work properly when it
is committed to the tree :(

thanks,

greg k-h

2023-12-19 15:51:46

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] kernel.h: removed REPEAT_BYTE from kernel.h

On Mon, Dec 18, 2023 at 06:44:47PM +0000, [email protected] wrote:
> This patch creates wordpart.h and includes it in asm/word-at-a-time.h
> for the all architectures. WORD_AT_A_TIME_CONSTANTS depends on kernel.h
> because of REPEAT_BYTE. Moving this to another header and including it
> where necessary allows us to not include the bloated kernel.h. Making
> this implicit dependancy on REPEAT_BYTE explicit allows for later
> improvements in the lib/string.c inclusion list.

...

> #include <linux/kernel.h>
> +#include <linux/wordpart.h>

Expected is to get rid of kernel.h completely.
For instance, in arch/arm/include/asm/word-at-a-time.h case this
should become:

#include <linux/bitops.h>
#include <linux/wordpart.h>

(Briefly looking the same is for include/asm-generic/word-at-a-time.h)

And so on...

...

> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -39,13 +39,6 @@

> -/**
> - * REPEAT_BYTE - repeat the value @x multiple times as an unsigned long value
> - * @x: value to repeat
> - *
> - * NOTE: @x is not checked for > 0xff; larger values produce odd results.
> - */
> -#define REPEAT_BYTE(x) ((~0ul / 0xff) * (x))

Okay, we don't back include because we changing all the users at once.

...

> +++ b/include/linux/wordpart.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0 */

> +/*
> + * Copyright (C) 2023 Google LLC <[email protected]>
> + */

Can be on a single line.

Is it? Personally I skip this from the split headers, but I'm not your lawyer.

--
With Best Regards,
Andy Shevchenko



2023-12-19 15:55:23

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] lib/string: shrink lib/string.i via IWYU

On Mon, Dec 18, 2023 at 06:44:48PM +0000, [email protected] wrote:
> This diff uses an open source tool include-what-you-use (IWYU) to modify
> the include list changing indirect includes to direct includes.
> IWYU is implemented using the IWYUScripts github repository which is a tool that is
> currently undergoing development. These changes seek to improve build times.
>
> This change to lib/string.c resulted in a preprocessed size of
> lib/string.i from 26371 lines to 5263 lines (-80%) for the x86
> defconfig.

...

> #define __NO_FORTIFY
> #include <linux/types.h>
> -#include <linux/string.h>
> -#include <linux/ctype.h>
> -#include <linux/kernel.h>
> -#include <linux/export.h>
> +#include <linux/bits.h>
> #include <linux/bug.h>
> #include <linux/errno.h>
> -#include <linux/slab.h>

> -

Why this blank is removed?

> +#include <linux/linkage.h>
> +#include <linux/stddef.h>
> +#include <linux/string.h>
> +#include <linux/ctype.h>
> #include <asm/unaligned.h>
> -#include <asm/byteorder.h>
> +#include <asm/rwonce.h>
> #include <asm/word-at-a-time.h>
> #include <asm/page.h>

Sort this group alphabetically as well.

> +#include <vdso/limits.h>

Just use linux/limits.h.

VDSO is a very special UAPI case. So it's even stricter rule
than for asm/ for using anything from there.

Expected result:

#include <linux/bits.h>
#include <linux/bug.h>
#include <linux/ctype.h>
#include <linux/errno.h>
#include <linux/limits.h>
#include <linux/linkage.h>
#include <linux/stddef.h>
#include <linux/string.h>
#include <linux/types.h>

#include <asm/page.h>
#include <asm/rwonce.h>
#include <asm/unaligned.h>
#include <asm/word-at-a-time.h>

--
With Best Regards,
Andy Shevchenko



2023-12-19 16:47:19

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] lib/string: shrink lib/string.i via IWYU

On Tue, Dec 19, 2023 at 7:55 AM Andy Shevchenko <[email protected]> wrote:
>
> On Mon, Dec 18, 2023 at 06:44:48PM +0000, [email protected] wrote:
> > This diff uses an open source tool include-what-you-use (IWYU) to modify
> > the include list changing indirect includes to direct includes.
> > IWYU is implemented using the IWYUScripts github repository which is a tool that is
> > currently undergoing development. These changes seek to improve build times.
> >
> > This change to lib/string.c resulted in a preprocessed size of
> > lib/string.i from 26371 lines to 5263 lines (-80%) for the x86
> > defconfig.
>
> ...
>
> > #define __NO_FORTIFY
> > #include <linux/types.h>
> > -#include <linux/string.h>
> > -#include <linux/ctype.h>
> > -#include <linux/kernel.h>
> > -#include <linux/export.h>
> > +#include <linux/bits.h>
> > #include <linux/bug.h>
> > #include <linux/errno.h>
> > -#include <linux/slab.h>
>
> > -
>
> Why this blank is removed?

The automation isn't aware of any convention around having blank lines
separate linux/* vs asm/*. Is that a convention we have throughout
the kernel, or just this file?

If we rerun the automation on this file after Tanzir's patch here, we
get no further changes. If we manually touch up the results, then
rerun the automation, it will undo the manual touch ups.

I don't mind saying "you might have to manually touch up the results
of the automation to comply with <link to documentation on stated
kernel policy/style guide>" but on the other hand I also think it
would be nice if other folks run the automation so that they don't get
additional changes. I'd like to avoid drive-by patches that just undo
any manual touch ups.

>
> > +#include <linux/linkage.h>
> > +#include <linux/stddef.h>
> > +#include <linux/string.h>
> > +#include <linux/ctype.h>
> > #include <asm/unaligned.h>
> > -#include <asm/byteorder.h>
> > +#include <asm/rwonce.h>
> > #include <asm/word-at-a-time.h>
> > #include <asm/page.h>
>
> Sort this group alphabetically as well.

By default the automation sorts the result. I asked Tanzir to
explicitly disable that; otherwise the resulting diffstat is hard to
tell precisely what was removed/added vs simply moved.

If he kept the default behavior, I highly suspect the feedback would
have been along the lines of "please don't sort the result; I can't
tell what moved vs was added or removed."

Perhaps we should add another commit on the series that manually sorts
the results _after_ the automation?

Do we have anything in Documentation/process/coding-style.rst about
sorting headers? There's a blip about clang-format sorting them, but
we don't have strong guidance along the lines of "you ought to sort
your includes (when you don't have special cases like x-macros)."

>
> > +#include <vdso/limits.h>
>
> Just use linux/limits.h.
>
> VDSO is a very special UAPI case. So it's even stricter rule
> than for asm/ for using anything from there.

We can add a special rule for this, Tanzir, please add. And remember
to re-measure the results of that change for this patch's commit
message.

>
> Expected result:
>
> #include <linux/bits.h>
> #include <linux/bug.h>
> #include <linux/ctype.h>
> #include <linux/errno.h>
> #include <linux/limits.h>
> #include <linux/linkage.h>
> #include <linux/stddef.h>
> #include <linux/string.h>
> #include <linux/types.h>
>
> #include <asm/page.h>
> #include <asm/rwonce.h>
> #include <asm/unaligned.h>
> #include <asm/word-at-a-time.h>
>
> --
> With Best Regards,
> Andy Shevchenko
>
>


--
Thanks,
~Nick Desaulniers

2023-12-21 16:38:24

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] lib/string: shrink lib/string.i via IWYU

On Tue, Dec 19, 2023 at 08:43:01AM -0800, Nick Desaulniers wrote:
> On Tue, Dec 19, 2023 at 7:55 AM Andy Shevchenko <[email protected]> wrote:
> > On Mon, Dec 18, 2023 at 06:44:48PM +0000, [email protected] wrote:

...

> > > #define __NO_FORTIFY
> > > #include <linux/types.h>
> > > -#include <linux/string.h>
> > > -#include <linux/ctype.h>
> > > -#include <linux/kernel.h>
> > > -#include <linux/export.h>
> > > +#include <linux/bits.h>
> > > #include <linux/bug.h>
> > > #include <linux/errno.h>
> > > -#include <linux/slab.h>
> >
> > > -
> >
> > Why this blank is removed?
>
> The automation isn't aware of any convention around having blank lines
> separate linux/* vs asm/*. Is that a convention we have throughout
> the kernel, or just this file?

There is no documented conventions like this. Probably one has to add some,
let's say, recommendations.

This way (alphabetical ordering and grouping) it is easier to maintain and
in long term makes easier (less possible conflicts) when backporting.

> If we rerun the automation on this file after Tanzir's patch here, we
> get no further changes. If we manually touch up the results, then
> rerun the automation, it will undo the manual touch ups.
>
> I don't mind saying "you might have to manually touch up the results
> of the automation to comply with <link to documentation on stated
> kernel policy/style guide>" but on the other hand I also think it
> would be nice if other folks run the automation so that they don't get
> additional changes. I'd like to avoid drive-by patches that just undo
> any manual touch ups.

I agree with this, but we have many maintainers and reviewers and some
ask one way, the others another way and so on... Without documentation
one can use a common sense and rationale behind it.

I don't really care about blank lines (however grouping is still done
as of today in this file), but I do care about ordering and grouping.

...

> > > +#include <linux/linkage.h>
> > > +#include <linux/stddef.h>
> > > +#include <linux/string.h>
> > > +#include <linux/ctype.h>
> > > #include <asm/unaligned.h>
> > > -#include <asm/byteorder.h>
> > > +#include <asm/rwonce.h>
> > > #include <asm/word-at-a-time.h>
> > > #include <asm/page.h>
> >
> > Sort this group alphabetically as well.
>
> By default the automation sorts the result. I asked Tanzir to
> explicitly disable that; otherwise the resulting diffstat is hard to
> tell precisely what was removed/added vs simply moved.
>
> If he kept the default behavior, I highly suspect the feedback would
> have been along the lines of "please don't sort the result; I can't
> tell what moved vs was added or removed."

Do you have such examples IRL?

In any case, how would you satisfy the maintainer's request which asks for
sorting?

> Perhaps we should add another commit on the series that manually sorts
> the results _after_ the automation?

I have no objections for this way (doing it in a two separate changes).
With the proposed you can satisfy any maintainer basically. Ones who
do not want sorting, may skip that patch, indeed.

> Do we have anything in Documentation/process/coding-style.rst about
> sorting headers? There's a blip about clang-format sorting them, but
> we don't have strong guidance along the lines of "you ought to sort
> your includes (when you don't have special cases like x-macros)."

Actually there is a rising trend to follow clang-format.
I have no idea what it does, but my rationale is explained above.

...

> > > +#include <vdso/limits.h>
> >
> > Just use linux/limits.h.
> >
> > VDSO is a very special UAPI case. So it's even stricter rule
> > than for asm/ for using anything from there.
>
> We can add a special rule for this, Tanzir, please add. And remember
> to re-measure the results of that change for this patch's commit
> message.

Yes, please do.

...

> > Expected result:
> >
> > #include <linux/bits.h>
> > #include <linux/bug.h>
> > #include <linux/ctype.h>
> > #include <linux/errno.h>
> > #include <linux/limits.h>
> > #include <linux/linkage.h>
> > #include <linux/stddef.h>
> > #include <linux/string.h>
> > #include <linux/types.h>
> >
> > #include <asm/page.h>
> > #include <asm/rwonce.h>
> > #include <asm/unaligned.h>
> > #include <asm/word-at-a-time.h>

So, let's have above with one or two patches.

--
With Best Regards,
Andy Shevchenko