2022-06-07 07:44:45

by Kevin Locke

Subject: [PATCH] kbuild: avoid regex RS for POSIX awk

In 22f26f21774f8 awk was added to deduplicate *.mod files. The awk
invocation passes -v RS='( |\n)' to match a space or newline character
as the record separator. Unfortunately, POSIX states[1]

> If RS contains more than one character, the results are unspecified.

Some implementations (such as the One True Awk[2] used by the BSDs) do
not treat RS as a regular expression. When awk does not support regex
RS, build failures such as the following are produced (first error using
allmodconfig):

CC [M] arch/x86/events/intel/uncore.o
CC [M] arch/x86/events/intel/uncore_nhmex.o
CC [M] arch/x86/events/intel/uncore_snb.o
CC [M] arch/x86/events/intel/uncore_snbep.o
CC [M] arch/x86/events/intel/uncore_discovery.o
LD [M] arch/x86/events/intel/intel-uncore.o
ld: cannot find uncore_nhmex.o: No such file or directory
ld: cannot find uncore_snb.o: No such file or directory
ld: cannot find uncore_snbep.o: No such file or directory
ld: cannot find uncore_discovery.o: No such file or directory
make[3]: *** [scripts/Makefile.build:422: arch/x86/events/intel/intel-uncore.o] Error 1
make[2]: *** [scripts/Makefile.build:487: arch/x86/events/intel] Error 2
make[1]: *** [scripts/Makefile.build:487: arch/x86/events] Error 2
make: *** [Makefile:1839: arch/x86] Error 2

To avoid this, use printf(1) to produce a newline between each object
path, instead of the space produced by echo(1), so that the default RS
can be used by awk.
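
A minimal sketch of the difference, outside of kbuild: with the default
RS, the space-separated output of echo reaches awk as a single record,
while printf '%s\n' emits one record per word. The helper function
names, the object names, and the obj/ prefix are made-up sample values,
not taken from the kernel tree.

```shell
# Hypothetical helpers; sample names stand in for the real-search output.

# echo joins its arguments with spaces, so awk (default RS = newline)
# sees one record and prefixes only the first name.
prefix_with_echo() {
    echo "$@" | awk '!x[$0]++ { print "obj/" $0 }'
}

# printf reuses its format for every argument, so each name becomes
# its own line and therefore its own awk record.
prefix_with_printf() {
    printf '%s\n' "$@" | awk '!x[$0]++ { print "obj/" $0 }'
}

prefix_with_echo   uncore.o uncore_snb.o uncore.o
prefix_with_printf uncore.o uncore_snb.o uncore.o
```

The first call prints a single line in which only the first name
carries the prefix; the second prints one prefixed name per line, with
the repeated name dropped.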

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
[2]: https://github.com/onetrueawk/awk

Fixes: 22f26f21774f ("kbuild: get rid of duplication in *.mod files")
Signed-off-by: Kevin Locke <[email protected]>
---
scripts/Makefile.build | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 1f01ac65c0cd..cac070aee791 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -251,8 +251,8 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE

# To make this rule robust against "Argument list too long" error,
# ensure to add $(obj)/ prefix by a shell command.
-cmd_mod = echo $(call real-search, $*.o, .o, -objs -y -m) | \
- $(AWK) -v RS='( |\n)' '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
+cmd_mod = printf '%s\n' $(call real-search, $*.o, .o, -objs -y -m) | \
+ $(AWK) '!x[$$0]++ { print("$(obj)/"$$0) }' > $@

$(obj)/%.mod: FORCE
$(call if_changed,mod)
--
2.35.1


2022-06-08 04:02:24

by Masahiro Yamada

Subject: Re: [PATCH] kbuild: avoid regex RS for POSIX awk

On Tue, Jun 7, 2022 at 11:43 AM Kevin Locke <[email protected]> wrote:
>
> In 22f26f21774f8 awk was added to deduplicate *.mod files. The awk
> invocation passes -v RS='( |\n)' to match a space or newline character
> as the record separator. Unfortunately, POSIX states[1]
>
> > If RS contains more than one character, the results are unspecified.
>
> Some implementations (such as the One True Awk[2] used by the BSDs) do
> not treat RS as a regular expression. When awk does not support regex
> RS, build failures such as the following are produced (first error using
> allmodconfig):
>
> CC [M] arch/x86/events/intel/uncore.o
> CC [M] arch/x86/events/intel/uncore_nhmex.o
> CC [M] arch/x86/events/intel/uncore_snb.o
> CC [M] arch/x86/events/intel/uncore_snbep.o
> CC [M] arch/x86/events/intel/uncore_discovery.o
> LD [M] arch/x86/events/intel/intel-uncore.o
> ld: cannot find uncore_nhmex.o: No such file or directory
> ld: cannot find uncore_snb.o: No such file or directory
> ld: cannot find uncore_snbep.o: No such file or directory
> ld: cannot find uncore_discovery.o: No such file or directory
> make[3]: *** [scripts/Makefile.build:422: arch/x86/events/intel/intel-uncore.o] Error 1
> make[2]: *** [scripts/Makefile.build:487: arch/x86/events/intel] Error 2
> make[1]: *** [scripts/Makefile.build:487: arch/x86/events] Error 2
> make: *** [Makefile:1839: arch/x86] Error 2
>
> To avoid this, use printf(1) to produce a newline between each object
> path, instead of the space produced by echo(1), so that the default RS
> can be used by awk.
>
> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
> [2]: https://github.com/onetrueawk/awk
>
> Fixes: 22f26f21774f ("kbuild: get rid of duplication in *.mod files")
> Signed-off-by: Kevin Locke <[email protected]>
> ---

Portable and clean solution!

Applied to linux-kbuild/fixes. Thanks.

--
Best Regards
Masahiro Yamada

2022-06-08 08:56:46

by David Laight

Subject: RE: [PATCH] kbuild: avoid regex RS for POSIX awk

From: Kevin Locke
> Sent: 07 June 2022 03:43
>
> In 22f26f21774f8 awk was added to deduplicate *.mod files.

Can't this be done with gmake's $(sort) function?

$(sort list)

Sorts the words of list in lexical order, removing duplicate words.
The output is a list of words separated by single spaces.

...
> # To make this rule robust against "Argument list too long" error,
> # ensure to add $(obj)/ prefix by a shell command.
> -cmd_mod = echo $(call real-search, $*.o, .o, -objs -y -m) | \
> - $(AWK) -v RS='( |\n)' '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
> +cmd_mod = printf '%s\n' $(call real-search, $*.o, .o, -objs -y -m) | \
> + $(AWK) '!x[$$0]++ { print("$(obj)/"$$0) }' > $@

I think the above only works because 'printf' is (usually) a
shell builtin - so the kernel's argv[] limit doesn't apply.
So the comment isn't really right.

But I think:

cmd_mod = $(addprefix $(obj)/,$(sort $(call real-search, $*.o, .o, -objs -y -m))) >$@

will have the required effect.
Without forking and execing multiple processes.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-06-08 11:03:03

by Masahiro Yamada

Subject: Re: [PATCH] kbuild: avoid regex RS for POSIX awk

On Wed, Jun 8, 2022 at 4:43 PM David Laight <[email protected]> wrote:
>
> From: Kevin Locke
> > Sent: 07 June 2022 03:43
> >
> > In 22f26f21774f8 awk was added to deduplicate *.mod files.
>
> Can't this be done with gmake's $(sort) function?
>
> $(sort list)
>
> Sorts the words of list in lexical order, removing duplicate words.
> The output is a list of words separated by single spaces.


$(sort ...) does two things:
- sorts the list alphabetically
- deduplicates the elements in the list

I want only the deduplication,
without changing the order.
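
To illustrate the distinction with a small sketch (plain shell rather
than make; the function names and sample words are hypothetical):
sort -u stands in for sort-and-deduplicate behavior, and the awk idiom
from the patch keeps first-seen order.

```shell
# sort -u both reorders and deduplicates (LC_ALL=C for a fixed collation).
dedup_sorted() {
    printf '%s\n' "$@" | LC_ALL=C sort -u
}

# The awk idiom from the patch drops repeats but keeps first-seen order.
dedup_stable() {
    printf '%s\n' "$@" | awk '!x[$0]++'
}

dedup_sorted zlib.o core.o zlib.o aux.o
dedup_stable zlib.o core.o zlib.o aux.o
```

The first call lists aux.o, core.o, zlib.o; the second lists zlib.o,
core.o, aux.o, which is the original order minus the repeat.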

>
> ...
> > # To make this rule robust against "Argument list too long" error,
> > # ensure to add $(obj)/ prefix by a shell command.
> > -cmd_mod = echo $(call real-search, $*.o, .o, -objs -y -m) | \
> > - $(AWK) -v RS='( |\n)' '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
> > +cmd_mod = printf '%s\n' $(call real-search, $*.o, .o, -objs -y -m) | \
> > + $(AWK) '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
>
> I think the above only works because 'printf' is (usually) a
> shell builtin - so the kernel's argv[] limit doesn't apply.
> So the comment isn't really right.

Would it make any difference if 'printf' were not a built-in?

Right, 'printf' is a built-in in bash and dash, so there we do not
need to worry about "Argument list too long", but I am not sure
this holds on every system.
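
One rough way to check a given shell, as a sketch only: in bash and
dash, command -v prints a bare name for a built-in and an absolute path
for an external utility (other shells may report this differently), and
getconf ARG_MAX shows the limit that applies to exec'd commands. The
function name printf_kind is hypothetical.

```shell
# Hypothetical helper: classify how the current shell resolves printf.
# Assumes the bash/dash convention for command -v output.
printf_kind() {
    case "$(command -v printf)" in
        /*)     echo external ;;
        printf) echo builtin ;;
        *)      echo unknown ;;
    esac
}

printf_kind
# Limit on argument plus environment data for exec'd commands, in bytes.
getconf ARG_MAX
```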



> But I think:
>
> cmd_mod = $(addprefix $(obj)/,$(sort $(call real-search, $*.o, .o, -objs -y -m))) >$@
>
> will have the required effect.


I think 'echo' is missing here.
As I noted above, I do not want to change the order.



> Without forking and execing multiple processes.
>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>


--
Best Regards
Masahiro Yamada

2022-06-08 19:03:21

by Sedat Dilek

Subject: Re: [PATCH] kbuild: avoid regex RS for POSIX awk

On Tue, Jun 7, 2022 at 7:31 AM Kevin Locke <[email protected]> wrote:
>
> In 22f26f21774f8 awk was added to deduplicate *.mod files. The awk
> invocation passes -v RS='( |\n)' to match a space or newline character
> as the record separator. Unfortunately, POSIX states[1]
>
> > If RS contains more than one character, the results are unspecified.
>
> Some implementations (such as the One True Awk[2] used by the BSDs) do
> not treat RS as a regular expression. When awk does not support regex
> RS, build failures such as the following are produced (first error using
> allmodconfig):
>
> CC [M] arch/x86/events/intel/uncore.o
> CC [M] arch/x86/events/intel/uncore_nhmex.o
> CC [M] arch/x86/events/intel/uncore_snb.o
> CC [M] arch/x86/events/intel/uncore_snbep.o
> CC [M] arch/x86/events/intel/uncore_discovery.o
> LD [M] arch/x86/events/intel/intel-uncore.o
> ld: cannot find uncore_nhmex.o: No such file or directory
> ld: cannot find uncore_snb.o: No such file or directory
> ld: cannot find uncore_snbep.o: No such file or directory
> ld: cannot find uncore_discovery.o: No such file or directory
> make[3]: *** [scripts/Makefile.build:422: arch/x86/events/intel/intel-uncore.o] Error 1
> make[2]: *** [scripts/Makefile.build:487: arch/x86/events/intel] Error 2
> make[1]: *** [scripts/Makefile.build:487: arch/x86/events] Error 2
> make: *** [Makefile:1839: arch/x86] Error 2
>
> To avoid this, use printf(1) to produce a newline between each object
> path, instead of the space produced by echo(1), so that the default RS
> can be used by awk.
>
> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
> [2]: https://github.com/onetrueawk/awk
>
> Fixes: 22f26f21774f ("kbuild: get rid of duplication in *.mod files")
> Signed-off-by: Kevin Locke <[email protected]>

Tested-by: Sedat Dilek <[email protected]> # LLVM-14 (x86-64)

-Sedat-

> ---
> scripts/Makefile.build | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 1f01ac65c0cd..cac070aee791 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -251,8 +251,8 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
>
> # To make this rule robust against "Argument list too long" error,
> # ensure to add $(obj)/ prefix by a shell command.
> -cmd_mod = echo $(call real-search, $*.o, .o, -objs -y -m) | \
> - $(AWK) -v RS='( |\n)' '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
> +cmd_mod = printf '%s\n' $(call real-search, $*.o, .o, -objs -y -m) | \
> + $(AWK) '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
>
> $(obj)/%.mod: FORCE
> $(call if_changed,mod)
> --
> 2.35.1
>