This patch changes the criteria for including vm regions in a core dump.
In recent glibc, the dynamic linker uses mprotect on part of a data segment
to write-protect pages that should never be touched after startup time
(this makes it harder for exploits to clobber indirection tables and the like).
Currently, this part of the segment is omitted from core dumps, losing
information about what the program did before it died.
The comment about the old VM_READ test no longer applies since writing uses
kmap now. Including unreadable regions gives you things like guard pages,
showing an accurate representation of of those in the core dump image.
Since we now have the ZERO_PAGE check, this won't actually write any more
pages to disk for those cases.
Any anonymous memory should always be dumped, since there is by definition
no other source of that information available after the process has died.
File-backed memory that is MAP_PRIVATE and has been touched should also be
dumped, since this is in essence an anonymous memory region. This is the
case that covers RELRO segments (data pages mprotect'd to read-only after
dynamic linker startup).
Thanks,
Roland
Signed-off-by: Roland McGrath <[email protected]>
Index: linux-2.6/fs/binfmt_elf.c
===================================================================
RCS file: /home/roland/redhat/bkcvs/linux-2.5/fs/binfmt_elf.c,v
retrieving revision 1.89
diff -B -p -u -r1.89 binfmt_elf.c
--- linux-2.6/fs/binfmt_elf.c 23 Sep 2004 01:24:06 -0000 1.89
+++ linux-2.6/fs/binfmt_elf.c 28 Sep 2004 00:16:26 -0000
@@ -1054,23 +1054,16 @@ static int dump_seek(struct file *file,
*/
static int maydump(struct vm_area_struct *vma)
{
- /*
- * If we may not read the contents, don't allow us to dump
- * them either. "dump_write()" can't handle it anyway.
- */
- if (!(vma->vm_flags & VM_READ))
- return 0;
-
/* Do not dump I/O mapped devices! -DaveM */
if (vma->vm_flags & VM_IO)
return 0;
-#if 1
if (vma->vm_flags & (VM_WRITE|VM_GROWSUP|VM_GROWSDOWN))
return 1;
- if (vma->vm_flags & (VM_READ|VM_EXEC|VM_EXECUTABLE|VM_SHARED))
- return 0;
-#endif
- return 1;
+ if (vma->vm_file == NULL) /* Anonymous memory is always dumped. */
+ return 1;
+ if (vma->anon_vma != NULL) /* We've had some COW here. */
+ return 1;
+ return 0;
}
#define roundup(x, y) ((((x)+((y)-1))/(y))*(y))
> Augh. Not all filesystems support holes,
We could represent the holes in ELF format instead. When a hole is at the
tail end of a vma, just reducing p_memsz does the trick. For holes in the
middle, we can make more ELF phdrs rather than one per vma.
> and I think VM_SHARED should trump VM_WRITE.
If VM_SHARED is never set when COW could have happened, then fine.
> But any non-dumped section _will_ show up in the ELF headers, so things
> like guard pages have nothing to do with "maydump()", imho.
Guard pages are indeed not truly an interesting case, since they are always
zero anyway. But in the general case, PROT_NONE is no guarantee that there
isn't worthwhile data there. e.g., some GC schemes use page protections
temporarily on parts of their heap--if it crashes in the middle of that,
you want that heap page's contents dumped even though at that instant the
page was not readable.
> In general, I'd much prefer just adding a VM_DIRTY flag, and making the
> COW logic set it. That should guarantee that we write out the minimal
> required sections.
I got the impression that vma->anon_vma being set when vma->vm_file is also
set means that COW happened. Would VM_DIRTY not be set in the same set of
cases?
Thanks,
Roland
On Wed, 20 Oct 2004, Roland McGrath wrote:
>
> This patch changes the criteria for including vm regions in a core dump.
> In recent glibc, the dynamic linker uses mprotect on part of a data segment
> to write-protect pages that should never be touched after startup time
> (this makes it harder for exploits to clobber indirection tables and the like).
> Currently, this part of the segment is omitted from core dumps, losing
> information about what the program did before it died.
Augh. Not all filesystems support holes, and I think VM_SHARED should
trump VM_WRITE.
In general, I'd much prefer just adding a VM_DIRTY flag, and making the
COW logic set it. That should guarantee that we write out the minimal
required sections.
> Including unreadable regions gives you things like guard pages, showing
> an accurate representation of of those in the core dump image. Since we
> now have the ZERO_PAGE check, this won't actually write any more pages
> to disk for those cases.
But any non-dumped section _will_ show up in the ELF headers, so things
like guard pages have nothing to do with "maydump()", imho. And as
mentioned, some filesystems _will_ write many more pages.
So if gdb has trouble with guard pages, pls get somebody to fix gdb
instead, and tell them what the difference between p_filesz and p_memsz
is. Putting them into the dump is just stupid.
So "maydump()" might sanely look something like
/* Shared or IO mappings are never written out */
if (vma->vm_flags & (VM_IO | VM_SHARED | VM_SHM))
return 0;
/*
* Was the mapped backing store opened for writing?
* This really only happens for VM_SHARED (which we
* don't write out anyway), but it's conceptually
* the right thing to do. Some future internal use
* might end up doing this.
*/
if (!(vma->vm_flags & VM_MAYWRITE))
return 0;
/* We really should add this flag */
if (!(vma->vm_flags & VM_DIRTY))
return 0;
/* otherwise, write it out */
return 1;
and that should make people happy. No?
Linus
On Wed, 20 Oct 2004, Linus Torvalds wrote:
>
> In general, I'd much prefer just adding a VM_DIRTY flag, and making the
> COW logic set it. That should guarantee that we write out the minimal
> required sections.
VM_DIRTY flag would be problematic, since you need mmap_sem exclusively
to modify vm_flags, whereas faulting holds mmap_sem non-exclusively.
But if I understand you aright, testing vma->anon_vma should do it.
> So "maydump()" might sanely look something like
>
> /* Shared or IO mappings are never written out */
> if (vma->vm_flags & (VM_IO | VM_SHARED | VM_SHM))
> return 0;
VM_SHM is a relic, there might be a driver or two still setting it,
but we should have deleted it long ago: you mean VM_RESERVED, I think.
> /*
> * Was the mapped backing store opened for writing?
> * This really only happens for VM_SHARED (which we
> * don't write out anyway), but it's conceptually
> * the right thing to do. Some future internal use
> * might end up doing this.
> */
> if (!(vma->vm_flags & VM_MAYWRITE))
> return 0;
Conceptually the right thing to do? I think it's best left out until
that future internal use makes sense of it. As it stands, I can't work
out what it's up to; and VM_MAYWRITE doesn't say if the backing store
was opened for writing, but whether the memory may be mprotected writable.
> /* We really should add this flag */
> if (!(vma->vm_flags & VM_DIRTY))
> return 0;
if (!vma->anon_vma)
return 0;
> /* otherwise, write it out */
> return 1;
Hugh
On Thu, 21 Oct 2004, Hugh Dickins wrote:
>
> if (!vma->anon_vma)
> return 0;
Ok. So the end result ends up pretty simple:
static int maydump(struct vm_area_struct *vma)
{
/* Do not dump I/O mapped devices, shared memory, or special mappings */
if (vma->vm_flags & (VM_IO | VM_SHARED | VM_RESERVED))
return 0;
/* If it hasn't been written to, don't write it out */
if (!vma->anon_vma)
return 0;
return 1;
}
does this make everybody happy?
Do the core-files blow up a lot from this? Or does it miss some case?
Linus
>
>
> On Thu, 21 Oct 2004, Hugh Dickins wrote:
> >
> > if (!vma->anon_vma)
> > return 0;
>
> Ok. So the end result ends up pretty simple:
>
> static int maydump(struct vm_area_struct *vma)
> {
> /* Do not dump I/O mapped devices, shared memory, or special mappings */
> if (vma->vm_flags & (VM_IO | VM_SHARED | VM_RESERVED))
> return 0;
>
> /* If it hasn't been written to, don't write it out */
> if (!vma->anon_vma)
> return 0;
>
> return 1;
> }
>
> does this make everybody happy?
I expect that has the same result as my patch (which tested vma->vm_file
instead of vma->vm_flags & VM_SHARED). It produces the same good results
on the simple tests I've done (for glibc's RELRO segments).
Thanks,
Roland