2021-06-17 16:03:25

by Naresh Kamboju

Subject: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On the linux-next 20210617 tag, the following x86_64 builds failed with
clang-10 and clang-11.
Regressions found on x86_64:

- build/clang-11-tinyconfig
- build/clang-11-allnoconfig
- build/clang-10-tinyconfig
- build/clang-10-allnoconfig
- build/clang-11-x86_64_defconfig
- build/clang-10-defconfig

We are running git bisect to identify the bad commit.

Build log:
------------
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: eb_relocate_parse_slow()+0x466: stack state mismatch: cfa1=4+120 cfa2=-1+0
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: eb_copy_relocations()+0x1e0: stack state mismatch: cfa1=4+104 cfa2=-1+0
x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
make[1]: *** [/builds/linux/Makefile:1252: vmlinux] Error 1
make[1]: Target '__all' not remade because of errors.
make: *** [Makefile:222: __sub-make] Error 2
make: Target '__all' not remade because of errors.
make --silent --keep-going --jobs=8
O=/home/tuxbuild/.cache/tuxmake/builds/current ARCH=x86_64
CROSS_COMPILE=x86_64-linux-gnu- 'HOSTCC=sccache clang' 'CC=sccache
clang' headers_install
INSTALL_HDR_PATH=/home/tuxbuild/.cache/tuxmake/builds/current/install_hdr/
tar caf /home/tuxbuild/.cache/tuxmake/builds/current/headers.tar.xz -C
/home/tuxbuild/.cache/tuxmake/builds/current/install_hdr .

ref:
https://builds.tuxbuild.com/1u4ZKFTh12vrYBVf8b1xGpaFOrE/

# TuxMake is a command line tool and Python library that provides
# portable and repeatable Linux kernel builds across a variety of
# architectures, toolchains, kernel configurations, and make targets.
#
# TuxMake supports the concept of runtimes; see
# https://docs.tuxmake.org/runtimes/. For that to work, you need to
# install podman or docker on your system.
#
# To install tuxmake on your system globally:
# sudo pip3 install -U tuxmake
#
# See https://docs.tuxmake.org/ for complete documentation.

tuxmake --runtime podman --target-arch x86_64 --toolchain clang-11 --kconfig x86_64_defconfig

build info:
git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
git_sha: 7d9c6b8147bdd76d7eb2cf6f74f84c6918ae0939
git_short_log: 7d9c6b8147bd ("Add linux-next specific files for 20210617")
kconfig: x86_64_defconfig
kernel_image:
kernel_version: 5.13.0-rc6
toolchain: clang-11

--
Linaro LKFT
https://lkft.linaro.org


2021-06-17 16:07:33

by Naresh Kamboju

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
>
> On the linux-next 20210617 tag, the following x86_64 builds failed with
> clang-10 and clang-11.
> Regressions found on x86_64:
>
> - build/clang-11-tinyconfig
> - build/clang-11-allnoconfig
> - build/clang-10-tinyconfig
> - build/clang-10-allnoconfig
> - build/clang-11-x86_64_defconfig
> - build/clang-10-defconfig
>
> We are running git bisect to identify the bad commit.
>
> Build log:
> ------------
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: eb_relocate_parse_slow()+0x466: stack state mismatch: cfa1=4+120 cfa2=-1+0
> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: eb_copy_relocations()+0x1e0: stack state mismatch: cfa1=4+104 cfa2=-1+0
> x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

The git bisect pointed out the first bad commit.

The first bad commit:
commit 928cf6adc7d60c96eca760c05c1000cda061604e
Author: Stephen Boyd <[email protected]>
Date: Thu Jun 17 15:21:35 2021 +1000
module: add printk formats to add module build ID to stacktraces

Let's make kernel stacktraces easier to identify by including the build
ID[1] of a module if the stacktrace is printing a symbol from a module.
This makes it simpler for developers to locate a kernel module's full
debuginfo for a particular stacktrace. Combined with
scripts/decode_stacktrace.sh, a developer can download the matching
debuginfo from a debuginfod[2] server and find the exact file and line
number for the functions plus offsets in a stacktrace that match the
module. This is especially useful for pstore crash debugging where the
kernel crashes are recorded in something like console-ramoops and the
recovery kernel/modules are different or the debuginfo doesn't exist on
the device due to space concerns (the debuginfo can be too large for space
limited devices).

Originally, I put this on the %pS format, but that was quickly rejected
given that %pS is used in other places such as ftrace where build IDs
aren't meaningful. There were some discussions on the list to put every
module build ID into the "Modules linked in:" section of the stacktrace
message but that quickly becomes very hard to read once you have more than
three or four modules linked in. It also provides too much information
when we don't expect each module to be traversed in a stacktrace. Having
the build ID for modules that aren't important just makes things messy.
Splitting it to multiple lines for each module quickly explodes the number
of lines printed in an oops too, possibly wrapping the warning off the
console. And finally, trying to stash away each module used in a
callstack to provide the ID of each symbol printed is cumbersome and would
require changes to each architecture to stash away modules and return
their build IDs once unwinding has completed.

Instead, we opt for the simpler approach of introducing new printk formats
'%pS[R]b' for "pointer symbolic backtrace with module build ID" and '%pBb'
for "pointer backtrace with module build ID" and then updating the few
places in the architecture layer where the stacktrace is printed to use
this new format.

Before:

Call trace:
lkdtm_WARNING+0x28/0x30 [lkdtm]
direct_entry+0x16c/0x1b4 [lkdtm]
full_proxy_write+0x74/0xa4
vfs_write+0xec/0x2e8

After:

Call trace:
lkdtm_WARNING+0x28/0x30 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9]
direct_entry+0x16c/0x1b4 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9]
full_proxy_write+0x74/0xa4
vfs_write+0xec/0x2e8

Link: https://lkml.kernel.org/r/[email protected]
Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
Link: https://sourceware.org/elfutils/Debuginfod.html [2]
Signed-off-by: Stephen Boyd <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Evan Green <[email protected]>
Cc: Hsin-Yi Wang <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Documentation/core-api/printk-formats.rst | 11 ++++
include/linux/kallsyms.h | 20 +++++-
include/linux/module.h | 8 ++-
kernel/kallsyms.c | 101 ++++++++++++++++++++++++------
kernel/module.c | 31 ++++++++-
lib/vsprintf.c | 8 ++-
6 files changed, 154 insertions(+), 25 deletions(-)
Previous HEAD position was b2dcc0267277 dump_stack: add vmlinux build
ID to stack traces
HEAD is now at 7d9c6b8147bd Add linux-next specific files for 20210617



> make[1]: *** [/builds/linux/Makefile:1252: vmlinux] Error 1
> make[1]: Target '__all' not remade because of errors.
> make: *** [Makefile:222: __sub-make] Error 2
> make: Target '__all' not remade because of errors.
> make --silent --keep-going --jobs=8
> O=/home/tuxbuild/.cache/tuxmake/builds/current ARCH=x86_64
> CROSS_COMPILE=x86_64-linux-gnu- 'HOSTCC=sccache clang' 'CC=sccache
> clang' headers_install
> INSTALL_HDR_PATH=/home/tuxbuild/.cache/tuxmake/builds/current/install_hdr/
> tar caf /home/tuxbuild/.cache/tuxmake/builds/current/headers.tar.xz -C
> /home/tuxbuild/.cache/tuxmake/builds/current/install_hdr .
>
> ref:
> https://builds.tuxbuild.com/1u4ZKFTh12vrYBVf8b1xGpaFOrE/
>
> # TuxMake is a command line tool and Python library that provides
> # portable and repeatable Linux kernel builds across a variety of
> # architectures, toolchains, kernel configurations, and make targets.
> #
> # TuxMake supports the concept of runtimes; see
> # https://docs.tuxmake.org/runtimes/. For that to work, you need to
> # install podman or docker on your system.
> #
> # To install tuxmake on your system globally:
> # sudo pip3 install -U tuxmake
> #
> # See https://docs.tuxmake.org/ for complete documentation.
>
> tuxmake --runtime podman --target-arch x86_64 --toolchain clang-11 --kconfig x86_64_defconfig
>
> ref:
> https://builds.tuxbuild.com/1u4ZKFTh12vrYBVf8b1xGpaFOrE/
>
> build info:
> git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
> git_sha: 7d9c6b8147bdd76d7eb2cf6f74f84c6918ae0939
> git_short_log: 7d9c6b8147bd ("Add linux-next specific files for 20210617")
> kconfig: x86_64_defconfig
> kernel_image:
> kernel_version: 5.13.0-rc6
> toolchain: clang-11

Reported-by: Naresh Kamboju <[email protected]>

> --
> Linaro LKFT
> https://lkft.linaro.org

2021-06-17 16:09:02

by Matthew Wilcox

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
>
> The git bisect pointed out the first bad commit.
>
> The first bad commit:
> commit 928cf6adc7d60c96eca760c05c1000cda061604e
> Author: Stephen Boyd <[email protected]>
> Date: Thu Jun 17 15:21:35 2021 +1000
> module: add printk formats to add module build ID to stacktraces

Your git bisect probably went astray. There's no way that commit
caused that regression.

2021-06-17 16:11:40

by Steven Rostedt

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On Thu, 17 Jun 2021 20:15:13 +0530
Naresh Kamboju <[email protected]> wrote:

> > Your git bisect probably went astray. There's no way that commit
> > caused that regression.
>
> Sorry for reporting an incorrect bad commit from git bisect.
>
> Is there a recommended way to run git bisect on the linux-next tree?
>
> Here is the git bisect log from gitlab pipeline,
> https://gitlab.com/Linaro/lkft/bisect/-/jobs/1354963448

Is it possible that it's not 100% reproducible?

Anyway, before posting the result of any commit as the buggy commit from a
git bisect, it is best to confirm it by:

1) Checking out the tree at the bad commit.
2) Verify that the tree at that point is bad
3) Check out the parent of that commit (the commit before the bad commit
was applied).
4) Verify that the tree at that point is good

May need to repeat the above a couple of times, in case the issue is not
100% reproducible.

If the above is true, then post the patch as the bad commit. If it is not,
then something went wrong with the bisect.

-- Steve

2021-06-17 18:25:02

by Naresh Kamboju

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

Hi Matthew,

On Thu, 17 Jun 2021 at 19:22, Matthew Wilcox <[email protected]> wrote:
>
> On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> > On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
> >
> > The git bisect pointed out the first bad commit.
> >
> > The first bad commit:
> > commit 928cf6adc7d60c96eca760c05c1000cda061604e
> > Author: Stephen Boyd <[email protected]>
> > Date: Thu Jun 17 15:21:35 2021 +1000
> > module: add printk formats to add module build ID to stacktraces
>
> Your git bisect probably went astray. There's no way that commit
> caused that regression.

Sorry for reporting an incorrect bad commit from git bisect.

Is there a recommended way to run git bisect on the linux-next tree?

Here is the git bisect log from gitlab pipeline,
https://gitlab.com/Linaro/lkft/bisect/-/jobs/1354963448

- Naresh

2021-06-17 19:50:06

by Nathan Chancellor

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

Rebuilt the CC list because most people were added based on the
incorrect bisect result.

On Thu, Jun 17, 2021 at 02:51:49PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> > On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
> >
> > The git bisect pointed out the first bad commit.
> >
> > The first bad commit:
> > commit 928cf6adc7d60c96eca760c05c1000cda061604e
> > Author: Stephen Boyd <[email protected]>
> > Date: Thu Jun 17 15:21:35 2021 +1000
> > module: add printk formats to add module build ID to stacktraces
>
> Your git bisect probably went astray. There's no way that commit
> caused that regression.

My bisect landed on commit 83f85ac75855 ("mm/mremap: convert huge PUD
move to separate helper"). flush_pud_tlb_range() evaluates to
BUILD_BUG() when CONFIG_TRANSPARENT_HUGEPAGE is unset but this function
is present just based on the value of
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

$ make -skj(nproc) ARCH=x86_64 CC=clang O=build/x86_64 distclean allnoconfig mm/mremap.o

$ llvm-readelf -s build/x86_64/mm/mremap.o &| rg __compiletime_assert
21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __compiletime_assert_337

$ rg TRANSPARENT_ build/x86_64/.config
450:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
451:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
562:# CONFIG_TRANSPARENT_HUGEPAGE is not set

Not sure why this does not happen on newer clang versions, presumably
something with inlining decisions? Still seems like a legitimate issue
to me.

Cheers,
Nathan

2021-06-18 06:21:02

by Aneesh Kumar K.V

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On 6/17/21 11:32 PM, Nathan Chancellor wrote:
> Rebuilt the CC list because most people were added based on the
> incorrect bisect result.
>
> On Thu, Jun 17, 2021 at 02:51:49PM +0100, Matthew Wilcox wrote:
>> On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
>>> On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
>>>> x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
>>>> mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
>>>
>>> The git bisect pointed out the first bad commit.
>>>
>>> The first bad commit:
>>> commit 928cf6adc7d60c96eca760c05c1000cda061604e
>>> Author: Stephen Boyd <[email protected]>
>>> Date: Thu Jun 17 15:21:35 2021 +1000
>>> module: add printk formats to add module build ID to stacktraces
>>
>> Your git bisect probably went astray. There's no way that commit
>> caused that regression.
>
> My bisect landed on commit 83f85ac75855 ("mm/mremap: convert huge PUD
> move to separate helper"). flush_pud_tlb_range() evaluates to
> BUILD_BUG() when CONFIG_TRANSPARENT_HUGEPAGE is unset but this function
> is present just based on the value of
> CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
>
> $ make -skj(nproc) ARCH=x86_64 CC=clang O=build/x86_64 distclean allnoconfig mm/mremap.o
>
> $ llvm-readelf -s build/x86_64/mm/mremap.o &| rg __compiletime_assert
> 21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __compiletime_assert_337
>
> $ rg TRANSPARENT_ build/x86_64/.config
> 450:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> 451:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
> 562:# CONFIG_TRANSPARENT_HUGEPAGE is not set
>
> Not sure why this does not happen on newer clang versions, presumably
> something with inlining decisions? Still seems like a legitimate issue
> to me.
>

gcc 10 also doesn't give a build error. I guess that is because we evaluate

if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) {

to if (0) with CONFIG_TRANSPARENT_HUGEPAGE disabled.

Switching that to if (1) does result in BUILD_BUG triggering.

Should we fix this?

modified mm/mremap.c
@@ -336,7 +336,7 @@ static inline bool move_normal_pud(struct vm_area_struct *vma,
}
#endif

-#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+#if defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
{
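
For readers following along, here is a minimal sketch of the guard pattern
proposed above: a helper that is only built when two options are both
defined. The SKETCH_* macros and sketch_* functions are hypothetical
stand-ins, not kernel symbols, and the fallback stub in the #else branch is
an assumption made so the example compiles on its own; the patch above does
not show what mm/mremap.c uses there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two Kconfig symbols named in the patch. */
#define SKETCH_HAVE_ARCH_HUGE_PUD 1
/* SKETCH_TRANSPARENT_HUGEPAGE is deliberately left undefined. */

/*
 * The whole condition sits on one line: a preprocessor directive ends at
 * the newline unless it is continued with a backslash.
 */
#if defined(SKETCH_HAVE_ARCH_HUGE_PUD) && defined(SKETCH_TRANSPARENT_HUGEPAGE)
#define SKETCH_REAL_HELPER_BUILT 1
static bool sketch_move_huge_pud(void)
{
	/* Stand-in for the real helper; only built when both are set. */
	return true;
}
#else
#define SKETCH_REAL_HELPER_BUILT 0
static bool sketch_move_huge_pud(void)
{
	/* Assumed fallback stub: nothing to move, nothing to assert on. */
	return false;
}
#endif

/* Small self-check of which variant the preprocessor selected. */
static void sketch_demo(void)
{
	bool moved = sketch_move_huge_pud();

	assert(SKETCH_REAL_HELPER_BUILT == 0);
	assert(!moved);
}
```

Calling sketch_demo() from a main() exercises the stub path. Defining
SKETCH_TRANSPARENT_HUGEPAGE as well would select the other variant, in which
case the two assertions would need to be flipped.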

2021-06-19 07:58:13

by Nathan Chancellor

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On Fri, Jun 18, 2021 at 10:32:42AM +0530, Aneesh Kumar K.V wrote:
> On 6/17/21 11:32 PM, Nathan Chancellor wrote:
> > Rebuilt the CC list because most people were added based on the
> > incorrect bisect result.
> >
> > On Thu, Jun 17, 2021 at 02:51:49PM +0100, Matthew Wilcox wrote:
> > > On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> > > > On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > > > > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > > > > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
> > > >
> > > > The git bisect pointed out the first bad commit.
> > > >
> > > > The first bad commit:
> > > > commit 928cf6adc7d60c96eca760c05c1000cda061604e
> > > > Author: Stephen Boyd <[email protected]>
> > > > Date: Thu Jun 17 15:21:35 2021 +1000
> > > > module: add printk formats to add module build ID to stacktraces
> > >
> > > Your git bisect probably went astray. There's no way that commit
> > > caused that regression.
> >
> > My bisect landed on commit 83f85ac75855 ("mm/mremap: convert huge PUD
> > move to separate helper"). flush_pud_tlb_range() evaluates to
> > BUILD_BUG() when CONFIG_TRANSPARENT_HUGEPAGE is unset but this function
> > is present just based on the value of
> > CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
> >
> > $ make -skj(nproc) ARCH=x86_64 CC=clang O=build/x86_64 distclean allnoconfig mm/mremap.o
> >
> > $ llvm-readelf -s build/x86_64/mm/mremap.o &| rg __compiletime_assert
> > 21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __compiletime_assert_337
> >
> > $ rg TRANSPARENT_ build/x86_64/.config
> > 450:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> > 451:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
> > 562:# CONFIG_TRANSPARENT_HUGEPAGE is not set
> >
> > Not sure why this does not happen on newer clang versions, presumably
> > something with inlining decisions? Still seems like a legitimate issue
> > to me.
> >
>
> gcc 10 also doesn't give a build error. I guess that is because we evaluate
>
> if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) {
>
> to if (0) with CONFIG_TRANSPARENT_HUGEPAGE disabled.
>
> Switching that to if (1) does result in BUILD_BUG triggering.

Thanks for pointing that out. I think what happens with clang-10 and
clang-11 is that move_huge_pud() gets inlined into move_pgt_entry() but
then the compiler does not figure out that the HPAGE_PUD case is dead so
the code sticks around, whereas GCC and newer clang versions can figure
that out and eliminate that case.
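
To illustrate the shape of the problem, here is a rough, runnable model of
the call chain. All sketch_* names are made up for illustration, and since a
link failure cannot be shown in runnable code, BUILD_BUG() is modeled as a
counter that must stay at zero. It is not the actual mm/mremap.c code.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_entry { SKETCH_NORMAL, SKETCH_HUGE_PUD };

/* Models BUILD_BUG(): in this sketch it only counts how often it is hit. */
static int build_bug_reached;

/* With the option disabled, the predicate is a constant 0. */
static inline int sketch_pud_trans_huge(unsigned long pud)
{
	(void)pud;
	return 0;
}

static bool sketch_move_huge_pud(void)
{
	/* In the kernel this is where the flush helper hits BUILD_BUG(). */
	build_bug_reached++;
	return true;
}

static bool sketch_move_pgt_entry(enum sketch_entry entry)
{
	switch (entry) {
	case SKETCH_HUGE_PUD:
		/* Unreachable at run time, but still present in the source. */
		return sketch_move_huge_pud();
	default:
		return false;
	}
}

static bool sketch_move_page_tables(unsigned long pud)
{
	if (sketch_pud_trans_huge(pud))	/* effectively if (0) */
		return sketch_move_pgt_entry(SKETCH_HUGE_PUD);
	return sketch_move_pgt_entry(SKETCH_NORMAL);
}

/* The huge-PUD path is never taken, so the counter stays at zero. */
static void sketch_demo(void)
{
	bool moved = sketch_move_page_tables(0x1000UL);

	assert(!moved);
	assert(build_bug_reached == 0);
}
```

The only caller that passes SKETCH_HUGE_PUD sits behind a constant-false
test, so the case can never run. Whether the compiler also proves it dead at
build time depends on its inlining and constant propagation, which seems to
be where clang-10 and clang-11 differ from GCC and newer clang.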

> Should we fix this?

Yes, I believe that we should.

> modified mm/mremap.c
> @@ -336,7 +336,7 @@ static inline bool move_normal_pud(struct vm_area_struct *vma,
> }
> #endif
>
> -#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> +#if defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
> static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
> unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
> {

That works or we could mirror what has already been done for the
HPAGE_PMD case. No personal preference.

diff --git a/mm/mremap.c b/mm/mremap.c
index 9a7fbec31dc9..5989d3990020 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -460,7 +460,8 @@ static bool move_pgt_entry(enum pgt_entry entry, struct vm_area_struct *vma,
new_entry);
break;
case HPAGE_PUD:
- moved = move_huge_pud(vma, old_addr, new_addr, old_entry,
+ moved = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
+ move_huge_pud(vma, old_addr, new_addr, old_entry,
new_entry);
break;


Cheers,
Nathan
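
For reference, a small sketch of why the "IS_ENABLED(...) && helper()" form
in the diff above keeps the helper out of the picture when the option is off.
SKETCH_IS_ENABLED() is a simplified stand-in modeled on the kernel's
IS_ENABLED(): it only handles options that are undefined or defined to 1.
The option names and the call counter are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for IS_ENABLED(): 1 if defined to 1, else 0. */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND(ignored, val, ...) val
#define SKETCH_IS_SET(arg1_or_junk) SKETCH_SECOND(arg1_or_junk 1, 0, 0)
#define SKETCH_EXPAND(val) SKETCH_IS_SET(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_ENABLED(option) SKETCH_EXPAND(option)

#define SKETCH_CONFIG_ON 1
/* SKETCH_CONFIG_OFF is deliberately left undefined. */

static int huge_moves;	/* counts calls to the helper */

static bool sketch_move_huge(void)
{
	huge_moves++;
	return true;
}

/* Option enabled: the right-hand side of && is evaluated. */
static bool sketch_move_when_on(void)
{
	return SKETCH_IS_ENABLED(SKETCH_CONFIG_ON) && sketch_move_huge();
}

/* Option disabled: && short-circuits on the constant 0. */
static bool sketch_move_when_off(void)
{
	return SKETCH_IS_ENABLED(SKETCH_CONFIG_OFF) && sketch_move_huge();
}

static void sketch_demo(void)
{
	bool off, on;
	int calls_after_off;

	huge_moves = 0;
	off = sketch_move_when_off();
	calls_after_off = huge_moves;
	on = sketch_move_when_on();

	assert(!off);
	assert(calls_after_off == 0);	/* helper was never evaluated */
	assert(on);
	assert(huge_moves == 1);
}
```

Because the left operand is a literal 0 after preprocessing, the call on the
right is trivially dead in the same expression. That should be far easier for
any compiler to discard than a case that is only dead because of what its
callers pass in.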

2021-06-23 23:42:47

by Nick Desaulniers

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

An additional report:
https://lore.kernel.org/lkml/20210623223015.GA315292@paulmck-ThinkPad-P17-Gen-1/
EOM

On Fri, Jun 18, 2021 at 4:05 PM Nathan Chancellor <[email protected]> wrote:
>
> On Fri, Jun 18, 2021 at 10:32:42AM +0530, Aneesh Kumar K.V wrote:
> > On 6/17/21 11:32 PM, Nathan Chancellor wrote:
> > > Rebuilt the CC list because most people were added based on the
> > > incorrect bisect result.
> > >
> > > On Thu, Jun 17, 2021 at 02:51:49PM +0100, Matthew Wilcox wrote:
> > > > On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> > > > > On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > > > > > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > > > > > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
> > > > >
> > > > > The git bisect pointed out the first bad commit.
> > > > >
> > > > > The first bad commit:
> > > > > commit 928cf6adc7d60c96eca760c05c1000cda061604e
> > > > > Author: Stephen Boyd <[email protected]>
> > > > > Date: Thu Jun 17 15:21:35 2021 +1000
> > > > > module: add printk formats to add module build ID to stacktraces
> > > >
> > > > Your git bisect probably went astray. There's no way that commit
> > > > caused that regression.
> > >
> > > My bisect landed on commit 83f85ac75855 ("mm/mremap: convert huge PUD
> > > move to separate helper"). flush_pud_tlb_range() evaluates to
> > > BUILD_BUG() when CONFIG_TRANSPARENT_HUGEPAGE is unset but this function
> > > is present just based on the value of
> > > CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
> > >
> > > $ make -skj(nproc) ARCH=x86_64 CC=clang O=build/x86_64 distclean allnoconfig mm/mremap.o
> > >
> > > $ llvm-readelf -s build/x86_64/mm/mremap.o &| rg __compiletime_assert
> > > 21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __compiletime_assert_337
> > >
> > > $ rg TRANSPARENT_ build/x86_64/.config
> > > 450:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> > > 451:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
> > > 562:# CONFIG_TRANSPARENT_HUGEPAGE is not set
> > >
> > > Not sure why this does not happen on newer clang versions, presumably
> > > something with inlining decisions? Still seems like a legitimate issue
> > > to me.
> > >
> >
> > gcc 10 also doesn't give a build error. I guess that is because we evaluate
> >
> > if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) {
> >
> > to if (0) with CONFIG_TRANSPARENT_HUGEPAGE disabled.
> >
> > > Switching that to if (1) does result in BUILD_BUG triggering.
>
> Thanks for pointing that out. I think what happens with clang-10 and
> clang-11 is that move_huge_pud() gets inlined into move_pgt_entry() but
> then the compiler does not figure out that the HPAGE_PUD case is dead so
> > the code sticks around, whereas GCC and newer clang versions can figure
> that out and eliminate that case.
>
> > > Should we fix this?
>
> Yes, I believe that we should.
>
> > modified mm/mremap.c
> > @@ -336,7 +336,7 @@ static inline bool move_normal_pud(struct vm_area_struct *vma,
> > }
> > #endif
> >
> > -#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> > +#if defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
> > static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
> > unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
> > {
>
> That works or we could mirror what has already been done for the
> HPAGE_PMD case. No personal preference.
>
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 9a7fbec31dc9..5989d3990020 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -460,7 +460,8 @@ static bool move_pgt_entry(enum pgt_entry entry, struct vm_area_struct *vma,
> new_entry);
> break;
> case HPAGE_PUD:
> - moved = move_huge_pud(vma, old_addr, new_addr, old_entry,
> + moved = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> + move_huge_pud(vma, old_addr, new_addr, old_entry,
> new_entry);
> break;
>
>
> Cheers,
> Nathan



--
Thanks,
~Nick Desaulniers

2021-06-24 00:21:41

by Paul E. McKenney

Subject: Re: [next] [clang] x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry': mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'

On Wed, Jun 23, 2021 at 04:39:56PM -0700, Nick Desaulniers wrote:
> An additional report:
> https://lore.kernel.org/lkml/20210623223015.GA315292@paulmck-ThinkPad-P17-Gen-1/
> EOM
>
> On Fri, Jun 18, 2021 at 4:05 PM Nathan Chancellor <[email protected]> wrote:
> >
> > On Fri, Jun 18, 2021 at 10:32:42AM +0530, Aneesh Kumar K.V wrote:
> > > On 6/17/21 11:32 PM, Nathan Chancellor wrote:
> > > > Rebuilt the CC list because most people were added based on the
> > > > incorrect bisect result.
> > > >
> > > > On Thu, Jun 17, 2021 at 02:51:49PM +0100, Matthew Wilcox wrote:
> > > > > On Thu, Jun 17, 2021 at 06:15:45PM +0530, Naresh Kamboju wrote:
> > > > > > On Thu, 17 Jun 2021 at 17:41, Naresh Kamboju <[email protected]> wrote:
> > > > > > > x86_64-linux-gnu-ld: mm/mremap.o: in function `move_pgt_entry':
> > > > > > > mremap.c:(.text+0x763): undefined reference to `__compiletime_assert_342'
> > > > > >
> > > > > > The git bisect pointed out the first bad commit.
> > > > > >
> > > > > > The first bad commit:
> > > > > > commit 928cf6adc7d60c96eca760c05c1000cda061604e
> > > > > > Author: Stephen Boyd <[email protected]>
> > > > > > Date: Thu Jun 17 15:21:35 2021 +1000
> > > > > > module: add printk formats to add module build ID to stacktraces
> > > > >
> > > > > Your git bisect probably went astray. There's no way that commit
> > > > > caused that regression.
> > > >
> > > > My bisect landed on commit 83f85ac75855 ("mm/mremap: convert huge PUD
> > > > move to separate helper"). flush_pud_tlb_range() evaluates to
> > > > BUILD_BUG() when CONFIG_TRANSPARENT_HUGEPAGE is unset but this function
> > > > is present just based on the value of
> > > > CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.
> > > >
> > > > $ make -skj(nproc) ARCH=x86_64 CC=clang O=build/x86_64 distclean allnoconfig mm/mremap.o
> > > >
> > > > $ llvm-readelf -s build/x86_64/mm/mremap.o &| rg __compiletime_assert
> > > > 21: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __compiletime_assert_337
> > > >
> > > > $ rg TRANSPARENT_ build/x86_64/.config
> > > > 450:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
> > > > 451:CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
> > > > 562:# CONFIG_TRANSPARENT_HUGEPAGE is not set
> > > >
> > > > Not sure why this does not happen on newer clang versions, presumably
> > > > something with inlining decisions? Still seems like a legitimate issue
> > > > to me.
> > > >
> > >
> > > gcc 10 also doesn't give a build error. I guess that is because we evaluate
> > >
> > > if (pud_trans_huge(*old_pud) || pud_devmap(*old_pud)) {
> > >
> > > to if (0) with CONFIG_TRANSPARENT_HUGEPAGE disabled.
> > >
> > > > Switching that to if (1) does result in BUILD_BUG triggering.
> >
> > Thanks for pointing that out. I think what happens with clang-10 and
> > clang-11 is that move_huge_pud() gets inlined into move_pgt_entry() but
> > then the compiler does not figure out that the HPAGE_PUD case is dead so
> > > the code sticks around, whereas GCC and newer clang versions can figure
> > that out and eliminate that case.
> >
> > > > Should we fix this?
> >
> > Yes, I believe that we should.
> >
> > > > modified mm/mremap.c
> > > > @@ -336,7 +336,7 @@ static inline bool move_normal_pud(struct vm_area_struct *vma,
> > > > }
> > > > #endif
> > > >
> > > > -#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> > > > +#if defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
> > > > static bool move_huge_pud(struct vm_area_struct *vma, unsigned long old_addr,
> > > > unsigned long new_addr, pud_t *old_pud, pud_t *new_pud)
> > > > {

Making the above change does the trick for my repeat-by, thank you!

> > That works or we could mirror what has already been done for the
> > HPAGE_PMD case. No personal preference.
> >
> > diff --git a/mm/mremap.c b/mm/mremap.c
> > index 9a7fbec31dc9..5989d3990020 100644
> > --- a/mm/mremap.c
> > +++ b/mm/mremap.c
> > @@ -460,7 +460,8 @@ static bool move_pgt_entry(enum pgt_entry entry, struct vm_area_struct *vma,
> > new_entry);
> > break;
> > case HPAGE_PUD:
> > - moved = move_huge_pud(vma, old_addr, new_addr, old_entry,
> > + moved = IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > + move_huge_pud(vma, old_addr, new_addr, old_entry,
> > new_entry);
> > break;

This one is already in -next, but you knew that already. I am happy to
test the resulting patch, when and if.

Thanx, Paul