2022-01-26 16:20:28

by Jann Horn

Subject: [PATCH] coredump: Also dump first pages of non-executable ELF libraries

When I rewrote the VMA dumping logic for coredumps, I changed it to
recognize ELF library mappings based on the file being executable instead
of the mapping having an ELF header. But turns out, distros ship many ELF
libraries as non-executable, so the heuristic goes wrong...

Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of
any offset-0 readable mapping that starts with the ELF magic.

This fix is technically layer-breaking a bit, because it checks for
something ELF-specific in fs/coredump.c; but since we probably want to
share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
And this also keeps the change small for backporting.

Cc: [email protected]
Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper")
Reported-by: Bill Messmer <[email protected]>
Signed-off-by: Jann Horn <[email protected]>
---

@Bill: If you happen to have a kernel tree lying around, you could give
this a try and report back whether this solves your issues?
But if not, it's also fine, I've tested myself that with this patch
applied, the first 0x1000 bytes of non-executable libraries are dumped
into the coredump according to "readelf".

fs/coredump.c | 39 ++++++++++++++++++++++++++++++++++-----
1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 1c060c0a2d72..b73817712dd2 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -42,6 +42,7 @@
 #include <linux/path.h>
 #include <linux/timekeeping.h>
 #include <linux/sysctl.h>
+#include <linux/elf.h>

 #include <linux/uaccess.h>
 #include <asm/mmu_context.h>
@@ -980,6 +981,8 @@ static bool always_dump_vma(struct vm_area_struct *vma)
 	return false;
 }

+#define DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER 1
+
 /*
  * Decide how much of @vma's contents should be included in a core dump.
  */
@@ -1039,9 +1042,20 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 	 * dump the first page to aid in determining what was mapped here.
 	 */
 	if (FILTER(ELF_HEADERS) &&
-	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) &&
-	    (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
-		return PAGE_SIZE;
+	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
+		if ((READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
+			return PAGE_SIZE;
+
+		/*
+		 * ELF libraries aren't always executable.
+		 * We'll want to check whether the mapping starts with the ELF
+		 * magic, but not now - we're holding the mmap lock,
+		 * so copy_from_user() doesn't work here.
+		 * Use a placeholder instead, and fix it up later in
+		 * dump_vma_snapshot().
+		 */
+		return DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER;
+	}

 #undef FILTER

@@ -1116,8 +1130,6 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
 		m->end = vma->vm_end;
 		m->flags = vma->vm_flags;
 		m->dump_size = vma_dump_size(vma, cprm->mm_flags);
-
-		vma_data_size += m->dump_size;
 	}

 	mmap_write_unlock(mm);
@@ -1127,6 +1139,23 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
 		return -EFAULT;
 	}

+	for (i = 0; i < *vma_count; i++) {
+		struct core_vma_metadata *m = (*vma_meta) + i;
+
+		if (m->dump_size == DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER) {
+			char elfmag[SELFMAG];
+
+			if (copy_from_user(elfmag, (void __user *)m->start, SELFMAG) ||
+					memcmp(elfmag, ELFMAG, SELFMAG) != 0) {
+				m->dump_size = 0;
+			} else {
+				m->dump_size = PAGE_SIZE;
+			}
+		}
+
+		vma_data_size += m->dump_size;
+	}
+
 	*vma_data_size_ptr = vma_data_size;
 	return 0;
 }

base-commit: 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8
--
2.35.0.rc0.227.g00780c9af4-goog
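
For readers who want to experiment with the two-pass idea outside the kernel, here is a minimal userspace sketch. It is not the kernel code: `SKETCH_PAGE_SIZE`, `SKETCH_PLACEHOLDER`, `starts_with_elf_magic()` and `resolve_dump_size()` are hypothetical names chosen for illustration, the magic bytes are spelled out locally instead of using `ELFMAG`/`SELFMAG`, and a plain buffer stands in for the `copy_from_user()` read of the mapping's first bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for ELFMAG / SELFMAG: the 4-byte magic "\177ELF". */
static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

#define SKETCH_PAGE_SIZE   4096UL /* assumption: 4 KiB pages */
#define SKETCH_PLACEHOLDER 1UL    /* mirrors the role of the patch's placeholder */

/* True if the buffer is long enough and begins with the ELF magic. */
static bool starts_with_elf_magic(const unsigned char *buf, size_t len)
{
	return len >= sizeof(elf_magic) &&
	       memcmp(buf, elf_magic, sizeof(elf_magic)) == 0;
}

/*
 * Second pass: turn a placeholder into a final dump size once the first
 * bytes of the mapping can be inspected. Any other size is passed through.
 */
static unsigned long resolve_dump_size(unsigned long dump_size,
				       const unsigned char *first_bytes,
				       size_t len)
{
	if (dump_size != SKETCH_PLACEHOLDER)
		return dump_size;

	return starts_with_elf_magic(first_bytes, len) ? SKETCH_PAGE_SIZE : 0;
}
```

The placeholder value works as a marker only because a real dump size is never 1 byte here: the sizes being recorded are either 0 or page-sized multiples.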


2022-01-27 01:13:41

by Bill Messmer

Subject: RE: [EXTERNAL] [PATCH] coredump: Also dump first pages of non-executable ELF libraries

Jann,

Apologies on the delay... I think it's probably been 20+ years since I've built and installed a Linux kernel. In any case, I cloned the current kernel git tree, applied your patch, rebuilt the kernel, and installed it in an Ubuntu 21.10 VM. After forcing a few process core dumps, it does indeed look like the problem is fixed. Just to triple check, I took one of those core dumps over to the Windows side and opened it with a recent windbg. It finds the build-ids of all the relevant images & SO's just fine:

0:000> dx @$curprocess.Modules.Select(mod => mod.Contents.BuildID)
@$curprocess.Modules.Select(mod => mod.Contents.BuildID)
[0x0] : Unable to read target memory at '0x7f5766631000' in method 'readMemoryValues' [at ImageInfo (line 1275 col 5)]
[0x1] : EF650611451904165E9CAF6080ECBAAD50B84D3F
[0x2] : 674ACF7BFECD6B8F382FE8D0D95F229087761289
[0x3] : C087D7951738C9EA3DFC7D15A7B31A7D7F862AE1
[0x4] : B8037B6260865346802321DD2256B8AD1D857E63
[0x5] : DB6AFCCC2EC0090045BBE5DDD68722A1434235E5
[0x6] : 3B4B1D0BA98C1B4081A6C5748A593D42C163F125
[0x7] : 4501188BC2E25791E446F7C110F8BC9282C98CD4
[0x8] : AE398331C90E9C84AE01A640DF017803BB775F63
[0x9] : 4E8C3A67A9606B9B33EDF9A24ED999E3C885E5BB
[0xa] : 6511403115C9BC3DF0DCD7167D8766B7FCC2AEE1
[0xb] : 14ACB10BBDAEFC6A64890C96417426CA820C0FAA
[0xc] : 2792043473EB7D1661942BC13DB9272918D2A790

And it is able to match the images/debug information to what I have for my Ubuntu VM as well.
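
As background on why the first page matters for this: a debugger typically finds the build-id by walking the ELF note data that the program headers in that page point to, and in common link layouts the note itself also sits near the start of the file. The following is only a rough sketch of such a walk, not what windbg actually does. `find_build_id()`, `struct note_header` and `SKETCH_NT_GNU_BUILD_ID` are hypothetical names, and the sketch assumes the note bytes use the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NT_GNU_BUILD_ID 3u /* same value as NT_GNU_BUILD_ID in <elf.h> */

/* Layout of an ELF note header (three 32-bit words). */
struct note_header {
	uint32_t namesz;
	uint32_t descsz;
	uint32_t type;
};

/* Note names and descriptors are padded to 4-byte boundaries. */
static size_t align4(size_t n)
{
	return (n + 3u) & ~(size_t)3u;
}

/*
 * Scan a buffer of note records for a GNU build-id note. Returns a pointer
 * to the ID bytes inside the buffer and stores their length in *id_len,
 * or returns NULL if no well-formed build-id note is found.
 */
static const unsigned char *find_build_id(const unsigned char *notes,
					  size_t len, size_t *id_len)
{
	size_t off = 0;

	while (len - off >= sizeof(struct note_header)) {
		struct note_header nh;
		const unsigned char *name, *desc;

		memcpy(&nh, notes + off, sizeof(nh));
		off += sizeof(nh);

		/* Reject a name that would run past the end of the buffer. */
		if (nh.namesz > len - off || align4(nh.namesz) > len - off)
			return NULL;
		name = notes + off;
		off += align4(nh.namesz);

		/* Same check for the descriptor, which holds the ID bytes. */
		if (nh.descsz > len - off)
			return NULL;
		desc = notes + off;

		if (nh.type == SKETCH_NT_GNU_BUILD_ID && nh.namesz == 4 &&
		    memcmp(name, "GNU", 4) == 0) {
			*id_len = nh.descsz;
			return desc;
		}

		/* Not a build-id note: skip its padded descriptor. */
		if (align4(nh.descsz) > len - off)
			return NULL;
		off += align4(nh.descsz);
	}

	return NULL;
}
```

If the page holding the ELF header is missing from the core dump, a walk like this has nothing to start from, which would be consistent with the read failure shown for entry [0x0] above.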

Thank you for the fix!

Sincerely,

Bill Messmer
[email protected]

-----Original Message-----
From: Jann Horn <[email protected]>
Sent: Tuesday, January 25, 2022 6:58 PM
To: Andrew Morton <[email protected]>
Cc: [email protected]; Bill Messmer <[email protected]>; Eric W . Biederman <[email protected]>; Al Viro <[email protected]>; Randy Dunlap <[email protected]>; Jann Horn <[email protected]>; [email protected]
Subject: [EXTERNAL] [PATCH] coredump: Also dump first pages of non-executable ELF libraries

When I rewrote the VMA dumping logic for coredumps, I changed it to recognize ELF library mappings based on the file being executable instead of the mapping having an ELF header. But turns out, distros ship many ELF libraries as non-executable, so the heuristic goes wrong...

Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of any offset-0 readable mapping that starts with the ELF magic.

This fix is technically layer-breaking a bit, because it checks for something ELF-specific in fs/coredump.c; but since we probably want to share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
And this also keeps the change small for backporting.

Cc: [email protected]
Fixes: 429a22e776a2 ("coredump: rework elf/elf_fdpic vma_dump_size() into common helper")
Reported-by: Bill Messmer <[email protected]>
Signed-off-by: Jann Horn <[email protected]>
---

@Bill: If you happen to have a kernel tree lying around, you could give this a try and report back whether this solves your issues?
But if not, it's also fine, I've tested myself that with this patch applied, the first 0x1000 bytes of non-executable libraries are dumped into the coredump according to "readelf".

fs/coredump.c | 39 ++++++++++++++++++++++++++++++++++-----
1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index 1c060c0a2d72..b73817712dd2 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -42,6 +42,7 @@
 #include <linux/path.h>
 #include <linux/timekeeping.h>
 #include <linux/sysctl.h>
+#include <linux/elf.h>

 #include <linux/uaccess.h>
 #include <asm/mmu_context.h>
@@ -980,6 +981,8 @@ static bool always_dump_vma(struct vm_area_struct *vma)
 	return false;
 }

+#define DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER 1
+
 /*
  * Decide how much of @vma's contents should be included in a core dump.
  */
@@ -1039,9 +1042,20 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 	 * dump the first page to aid in determining what was mapped here.
 	 */
 	if (FILTER(ELF_HEADERS) &&
-	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) &&
-	    (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
-		return PAGE_SIZE;
+	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ)) {
+		if ((READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
+			return PAGE_SIZE;
+
+		/*
+		 * ELF libraries aren't always executable.
+		 * We'll want to check whether the mapping starts with the ELF
+		 * magic, but not now - we're holding the mmap lock,
+		 * so copy_from_user() doesn't work here.
+		 * Use a placeholder instead, and fix it up later in
+		 * dump_vma_snapshot().
+		 */
+		return DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER;
+	}

 #undef FILTER

@@ -1116,8 +1130,6 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
 		m->end = vma->vm_end;
 		m->flags = vma->vm_flags;
 		m->dump_size = vma_dump_size(vma, cprm->mm_flags);
-
-		vma_data_size += m->dump_size;
 	}

 	mmap_write_unlock(mm);
@@ -1127,6 +1139,23 @@ int dump_vma_snapshot(struct coredump_params *cprm, int *vma_count,
 		return -EFAULT;
 	}

+	for (i = 0; i < *vma_count; i++) {
+		struct core_vma_metadata *m = (*vma_meta) + i;
+
+		if (m->dump_size == DUMP_SIZE_MAYBE_ELFHDR_PLACEHOLDER) {
+			char elfmag[SELFMAG];
+
+			if (copy_from_user(elfmag, (void __user *)m->start, SELFMAG) ||
+					memcmp(elfmag, ELFMAG, SELFMAG) != 0) {
+				m->dump_size = 0;
+			} else {
+				m->dump_size = PAGE_SIZE;
+			}
+		}
+
+		vma_data_size += m->dump_size;
+	}
+
 	*vma_data_size_ptr = vma_data_size;
 	return 0;
 }

base-commit: 0280e3c58f92b2fe0e8fbbdf8d386449168de4a8
--
2.35.0.rc0.227.g00780c9af4-goog

2022-02-02 18:13:51

by Jann Horn

Subject: Re: [PATCH] coredump: Also dump first pages of non-executable ELF libraries

On Wed, Feb 2, 2022 at 4:19 PM Eric W. Biederman <[email protected]> wrote:
>
> Jann Horn <[email protected]> writes:
>
> > When I rewrote the VMA dumping logic for coredumps, I changed it to
> > recognize ELF library mappings based on the file being executable instead
> > of the mapping having an ELF header. But turns out, distros ship many ELF
> > libraries as non-executable, so the heuristic goes wrong...
> >
> > Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of
> > any offset-0 readable mapping that starts with the ELF magic.
> >
> > This fix is technically layer-breaking a bit, because it checks for
> > something ELF-specific in fs/coredump.c; but since we probably want to
> > share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
> > And this also keeps the change small for backporting.
>
> In light of the conflict with my other changes, and in light of the pain
> of calling get_user, is there any reason why the code doesn't
> unconditionally dump all headers? Something like the diff below?
>
> I looked in the history and the code was filtering for ELF headers
> there already. I am just thinking this feels like a good idea
> regardless of the file format to help verify the file on-disk
> is the file we think was mapped.

Yeah, I guess that's reasonable. The main difference will probably be
that the starting pages of some font files, locale files and python
libraries will also be logged.

2022-02-04 23:56:27

by Eric W. Biederman

Subject: Re: [PATCH] coredump: Also dump first pages of non-executable ELF libraries

Jann Horn <[email protected]> writes:

> When I rewrote the VMA dumping logic for coredumps, I changed it to
> recognize ELF library mappings based on the file being executable instead
> of the mapping having an ELF header. But turns out, distros ship many ELF
> libraries as non-executable, so the heuristic goes wrong...
>
> Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of
> any offset-0 readable mapping that starts with the ELF magic.
>
> This fix is technically layer-breaking a bit, because it checks for
> something ELF-specific in fs/coredump.c; but since we probably want to
> share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
> And this also keeps the change small for backporting.

In light of the conflict with my other changes, and in light of the pain
of calling get_user, is there any reason why the code doesn't
unconditionally dump all headers? Something like the diff below?

I looked in the history and the code was filtering for ELF headers
there already. I am just thinking this feels like a good idea
regardless of the file format to help verify the file on-disk
is the file we think was mapped.

Eric

diff --git a/fs/coredump.c b/fs/coredump.c
index 6a97a8ea7295..ef3b03e4cf59 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -1047,8 +1047,7 @@ static unsigned long vma_dump_size(struct vm_area_struct *vma,
 	 * dump the first page to aid in determining what was mapped here.
 	 */
 	if (FILTER(ELF_HEADERS) &&
-	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ) &&
-	    (READ_ONCE(file_inode(vma->vm_file)->i_mode) & 0111) != 0)
+	    vma->vm_pgoff == 0 && (vma->vm_flags & VM_READ))
 		return PAGE_SIZE;

 #undef FILTER
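
To make the difference between the three heuristics discussed in this thread concrete, here is a small userspace sketch that models only this one rule of `vma_dump_size()`. The names `struct mapping_sketch`, `enum header_policy` and `header_dump_size()` are hypothetical, a 4 KiB page size is assumed, and all the other filter rules in the real function are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

enum header_policy {
	POLICY_EXEC_BIT,      /* check from 429a22e776a2: file must be executable */
	POLICY_ELF_MAGIC,     /* Jann's patch: executable, or starts with ELF magic */
	POLICY_UNCONDITIONAL, /* Eric's diff: any readable offset-0 file mapping */
};

/* Hypothetical, simplified description of a file-backed mapping. */
struct mapping_sketch {
	unsigned long pgoff;          /* page offset of the mapping in the file */
	bool readable;                /* stands in for VM_READ */
	unsigned int mode;            /* file permission bits */
	unsigned char first_bytes[4]; /* first bytes of the mapped file */
};

static unsigned long header_dump_size(const struct mapping_sketch *m,
				      enum header_policy policy)
{
	static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

	/* All three variants require a readable mapping of file offset 0. */
	if (m->pgoff != 0 || !m->readable)
		return 0;

	switch (policy) {
	case POLICY_EXEC_BIT:
		return (m->mode & 0111) ? SKETCH_PAGE_SIZE : 0;
	case POLICY_ELF_MAGIC:
		if (m->mode & 0111)
			return SKETCH_PAGE_SIZE;
		return memcmp(m->first_bytes, elf_magic, sizeof(elf_magic)) == 0 ?
			SKETCH_PAGE_SIZE : 0;
	case POLICY_UNCONDITIONAL:
		return SKETCH_PAGE_SIZE;
	}

	return 0;
}
```

With a non-executable (mode 0644) shared library, the executable-bit check dumps nothing while the other two dump one page. With a non-executable font or locale file, only the unconditional variant dumps the first page, which is the behavioural difference Jann points out in his reply.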


2022-02-06 08:22:37

by Andrew Morton

Subject: Re: [PATCH] coredump: Also dump first pages of non-executable ELF libraries

On Fri, 4 Feb 2022 00:59:59 +0100 Jann Horn <[email protected]> wrote:

> > > I looked in the history and the code was filtering for ELF headers
> > > there already. I am just thinking this feels like a good idea
> > > regardless of the file format to help verify the file on-disk
> > > is the file we think was mapped.
> >
> > Yeah, I guess that's reasonable. The main difference will probably be
> > that the starting pages of some font files, locale files and python
> > libraries will also be logged.
>
> Are you planning to turn that into a proper patch and take it through
> your tree, or something like that? If so, we should tell akpm to take
> this one back out...

I have
coredump-also-dump-first-pages-of-non-executable-elf-libraries.patch on
hold, pending outcome of this discussion.


2022-02-09 09:44:04

by Jann Horn

Subject: Re: [PATCH] coredump: Also dump first pages of non-executable ELF libraries

On Wed, Feb 2, 2022 at 6:43 PM Jann Horn <[email protected]> wrote:
>
> On Wed, Feb 2, 2022 at 4:19 PM Eric W. Biederman <[email protected]> wrote:
> >
> > Jann Horn <[email protected]> writes:
> >
> > > When I rewrote the VMA dumping logic for coredumps, I changed it to
> > > recognize ELF library mappings based on the file being executable instead
> > > of the mapping having an ELF header. But turns out, distros ship many ELF
> > > libraries as non-executable, so the heuristic goes wrong...
> > >
> > > Restore the old behavior where FILTER(ELF_HEADERS) dumps the first page of
> > > any offset-0 readable mapping that starts with the ELF magic.
> > >
> > > This fix is technically layer-breaking a bit, because it checks for
> > > something ELF-specific in fs/coredump.c; but since we probably want to
> > > share this between standard ELF and FDPIC ELF anyway, I guess it's fine?
> > > And this also keeps the change small for backporting.
> >
> > In light of the conflict with my other changes, and in light of the pain
> > of calling get_user, is there any reason why the code doesn't
> > unconditionally dump all headers? Something like the diff below?
> >
> > I looked in the history and the code was filtering for ELF headers
> > there already. I am just thinking this feels like a good idea
> > regardless of the file format to help verify the file on-disk
> > is the file we think was mapped.
>
> Yeah, I guess that's reasonable. The main difference will probably be
> that the starting pages of some font files, locale files and python
> libraries will also be logged.

Are you planning to turn that into a proper patch and take it through
your tree, or something like that? If so, we should tell akpm to take
this one back out...