Hello,
It has been my understanding for some time that the kernel config option CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS (and the corresponding bit 4 of the coredump filter) was, at one point, added for the purpose of ensuring that the GNU build-id of ELF objects was included in core dumps. The config description in Kconfig.binfmt even alludes to this.
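For readers who want to check the filter bit mentioned above, here is a minimal sketch. It assumes the usual /proc/<pid>/coredump_filter interface, which reports the mask as hexadecimal text; the helper names are illustrative only and are not part of any kernel or library API.

```python
ELF_HEADERS_BIT = 4  # bit 4 of the coredump filter, as discussed above


def dumps_elf_headers(filter_text: str) -> bool:
    """Return True if bit 4 is set in a coredump_filter value (hex text)."""
    mask = int(filter_text.strip(), 16)
    return bool(mask & (1 << ELF_HEADERS_BIT))


def read_coredump_filter(pid: str = "self") -> str:
    """Read the raw filter text for a process (Linux only)."""
    with open(f"/proc/{pid}/coredump_filter") as f:
        return f.read()


if __name__ == "__main__":
    print(dumps_elf_headers(read_coredump_filter()))
```

A value such as 00000033 has bit 4 set, while 00000023 does not.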
I am trying to understand why, in the 5.10+ kernels, the kernel went from checking whether a given memory mapping has an ELF header (in order to determine whether to include the page) to checking whether the inode is executable. The change in question:
github.com/torvalds/linux/commit/429a22e776a2b9f85a2b9c53d8e647598b553dd1
In many distributions (e.g.: Ubuntu), the shared objects in /usr/lib and elsewhere are not marked as executable. One of the net effects here is that the first page of shared objects on these distributions is no longer captured in core dumps.
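The difference between the two heuristics can be illustrated from user space. This is only a rough analogy under stated assumptions, not the kernel code: the function names are hypothetical, and the kernel works on mappings and inodes rather than on paths.

```python
import os
import stat

ELF_MAGIC = b"\x7fELF"


def starts_with_elf_header(path: str) -> bool:
    """Older-style check: does the file begin with the ELF magic bytes?"""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


def is_marked_executable(path: str) -> bool:
    """Newer-style check: is any execute permission bit set on the file?"""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


if __name__ == "__main__":
    lib = "/usr/lib/x86_64-linux-gnu/libc.so.6"
    if os.path.exists(lib):
        print(starts_with_elf_header(lib), is_marked_executable(lib))
```

A shared object installed with mode 0644 passes the first check but fails the second, which matches the behaviour described above.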
A core dump taken on Ubuntu 21.10 (with the 5.13 kernel) will, by default, not include these pages:
  LOAD           0x0000000000007000 0x00007f375855f000 0x0000000000000000
                 0x0000000000000000 0x000000000002c000  R      0x1000
   0x00007f375855f000  0x00007f375858b000  0x0000000000000000
        /usr/lib/x86_64-linux-gnu/libc.so.6
Doing a quick "sudo chmod +x /usr/lib/x86_64-linux-gnu/libc.so.6" and repeating shows that the page is then included:
  LOAD           0x0000000000007000 0x00007fefd5282000 0x0000000000000000
                 0x0000000000001000 0x000000000002c000  R      0x1000
   0x00007fefd5282000  0x00007fefd52ae000  0x0000000000000000
        /usr/lib/x86_64-linux-gnu/libc.so.6
Prior to running with 5.10+ kernels, I was always seeing the first page of shared objects (and the contained build-id) within core dumps (assuming the proper kernel config and core dump filter bits). Not any longer.
The reason I ask this is that, as more teams here at Microsoft have products running on Linux (or in Linux containers), we have been pushing the crash reports for those up through the same post-mortem crash analysis infrastructure that we do for Windows. That means that what has traditionally been the Windows debugger (e.g.: WinDbg) has, for some time, been able to open, debug, and analyze various Linux post-mortem crash formats. Part of doing this on a post-mortem basis requires finding the original images and debug information for the executables and shared objects referenced in those core dumps. Whether we do that via our own symbol servers or via a debuginfod service -- the post-mortem debugger needs access to the build-ids of those objects.
Until recently, finding these from a core dump has been stable and working quite well. Of late, however, we have been seeing a number of crash reports (e.g.: from Debian or Ubuntu containers) where we can no longer find images & symbols based on the core dumps because this kernel change has caused the first page of shared object files to not be captured in core dumps. I don't know how many post-mortem Linux crash analysis solutions this is affecting...
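As an illustration of what the debugger needs from that first page, here is a minimal sketch of pulling a GNU build-id out of raw ELF note data. It assumes the note bytes have already been located (for example via a PT_NOTE program header), that fields are little-endian, and that names and descriptors are padded to 4 bytes; the function name is hypothetical.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used for the GNU build-id


def find_build_id(notes: bytes, byteorder: str = "<") -> Optional[str]:
    """Scan raw ELF note data and return the build-id as hex, or None."""
    offset = 0
    while offset + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from(byteorder + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = notes[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded likewise
        if len(desc) < descsz:
            return None  # truncated data, e.g. the page was not dumped
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None
```

If the page holding the notes is missing from the dump, there is nothing to scan and a lookup like this returns None, which is the failure mode described above.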
Was the change here really the intent, or is this a kernel bug?
Sincerely,
Bill Messmer
[email protected]
[add the patch author, Jann]
On 1/20/22 17:31, Bill Messmer wrote:
> Hello,
>
> It has been my understanding for some time that the kernel config option CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS (and the corresponding bit 4 of the coredump filter) was, at one point, added for the purpose of ensuring that the GNU build-id of ELF objects was included in core dumps. The config description in Kconfig.binfmt even alludes to this.
>
> I am trying to understand why, in the 5.10+ kernels, the kernel went from checking whether a given memory mapping has an ELF header (in order to determine whether to include the page) to checking whether the inode is executable. The change in question:
>
> github.com/torvalds/linux/commit/429a22e776a2b9f85a2b9c53d8e647598b553dd1
>
Bill,
You should send email(s) to the relevant people if you can identify them.
LKML is a huge pipe (hose) and people don't normally browse it. :)
--
~Randy
On Fri, Jan 21, 2022 at 7:18 PM Randy Dunlap <[email protected]> wrote:
> On 1/20/22 17:31, Bill Messmer wrote:
> > Hello,
> >
> > It has been my understanding for some time that the kernel config option CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS (and the corresponding bit 4 of the coredump filter) was, at one point, added for the purpose of ensuring that the GNU build-id of ELF objects was included in core dumps. The config description in Kconfig.binfmt even alludes to this.
> >
> > I am trying to understand why, in the 5.10+ kernels, the kernel went from checking whether a given memory mapping has an ELF header (in order to determine whether to include the page) to checking whether the inode is executable. The change in question:
> >
> > github.com/torvalds/linux/commit/429a22e776a2b9f85a2b9c53d8e647598b553dd1
As the commit message says, it was an attempt to avoid a deadlock
without making the code overly complicated. Clearly that didn't go as
planned...
> > In many distributions (e.g.: Ubuntu), the shared objects in /usr/lib and elsewhere are not marked as executable.
Urgh, crap. I'm looking around on my Debian box now, and I also see
that some libraries (like ld.so and libc) are marked executable, but
many others are not...
[...]
> > Was the change here really the intent...? or is this a kernel bug?
Yeah, that's a bug. Linus suggested it as a way to simplify my
original patch (https://lore.kernel.org/all/CAHk-=wiOqR-4jXpPe-5PBKSCwQQFDaiJwkJr6ULwhcN8DJoG0A@mail.gmail.com/)
and it seemed like a good idea to me...
I guess the good news is that the original patch
https://lore.kernel.org/all/[email protected]/
already has the code for doing it properly, so it should be pretty
straightforward to fix this by just pasting over some bits from the
old patch... I'll try to get around to that soon.
This would be so much nicer if the kernel actually knew what is a
library mapping and what isn't... oh well.