LinuxLists.cc - 3.10.9: Oops at elf_core

2013-08-29 21:46:43

Subject: 3.10.9: Oops at elf_core_dump()

Hi,
I just got this stack trace. I am not sure whom to send it to; searching the
MAINTAINERS file for ELF gave me nothing. ;-)

[105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
[105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
[105670.434401] Oops: 0000 [#1] SMP
[105670.434413] Modules linked in: iwldvm iwlwifi
[105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
[105670.434451] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[105670.434468] task: ffff88037df42f70 ti: ffff88018683c000 task.ti: ffff88018683c000
[105670.434487] RIP: 0010:[<ffffffff812f7b42>] [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.434509] RSP: 0018:ffff88018683d9f0 EFLAGS: 00010246
[105670.434523] RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff88037df42f70
[105670.434542] RDX: 00000000016e3610 RSI: 0000000000000000 RDI: 0000000000000000
[105670.434560] RBP: ffff88018683da08 R08: 0000000000000000 R09: 0000000000000000
[105670.434579] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88018683db00
[105670.434598] R13: 00007ffffffff000 R14: 0000000000000004 R15: 0000000000000000
[105670.434617] FS: 00007f89b0989740(0000) GS:ffff88041d800000(0000) knlGS:0000000000000000
[105670.434637] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[105670.434652] CR2: 0000000000000000 CR3: 00000002b4c06000 CR4: 00000000000407f0
[105670.434671] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[105670.434690] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[105670.434708] Stack:
[105670.434715] ffffffff811bb0a3 0000000000000004 00000000000003d8 ffff88018683dc08
[105670.434738] ffffffff811bbcbd ffffffff811bb913 0000000000000000 ffff88018683db28
[105670.434762] ffff88037df42f70 ffff88018683ffff 0000000000000246 000f424200000000
[105670.434785] Call Trace:
[105670.434795] [<ffffffff811bb0a3>] ? notesize.isra.11+0x13/0x30
[105670.434812] [<ffffffff811bbcbd>] elf_core_dump+0xbfd/0x1570
[105670.434828] [<ffffffff811bb913>] ? elf_core_dump+0x853/0x1570
[105670.434845] [<ffffffff811c2bc5>] ? do_coredump+0xe25/0xff0
[105670.434861] [<ffffffff810eb85d>] ? trace_hardirqs_on+0xd/0x10
[105670.434878] [<ffffffff8116e0ff>] ? __sb_start_write+0xdf/0x1b0
[105670.434894] [<ffffffff811c2bc5>] ? do_coredump+0xe25/0xff0
[105670.434911] [<ffffffff81097209>] ? unshare_files+0x29/0xa0
[105670.434926] [<ffffffff811c289c>] do_coredump+0xafc/0xff0
[105670.434943] [<ffffffff810a63c8>] ? __sigqueue_free+0x38/0x40
[105670.434960] [<ffffffff810a9961>] get_signal_to_deliver+0x1c1/0x5c0
[105670.434977] [<ffffffff810a8721>] ? do_send_sig_info+0x61/0x90
[105670.434994] [<ffffffff81002303>] do_signal+0x53/0x8e0
[105670.435008] [<ffffffff810a8cd0>] ? kill_pgrp+0x60/0x60
[105670.435025] [<ffffffff810c2b9e>] ? finish_task_switch+0x7e/0xe0
[105670.435043] [<ffffffff81799750>] ? sysret_signal+0x5/0x47
[105670.435058] [<ffffffff81002bef>] do_notify_resume+0x5f/0x70
[105670.435074] [<ffffffff812fc71e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[105670.435092] [<ffffffff817999e2>] int_signal+0x12/0x17
[105670.435106] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00
[105670.435238] RIP [<ffffffff812f7b42>] strlen+0x2/0x20
[105670.435254] RSP <ffff88018683d9f0>
[105670.435843] CR2: 0000000000000000
[105670.439699] ---[ end trace 9d67aee555e92d75 ]---

Martin

2013-08-29 22:03:28

by Greg Kroah-Hartman

Subject: Re: 3.10.9: Oops at elf_core_dump()

On Thu, Aug 29, 2013 at 11:46:18PM +0200, Martin MOKREJŠ wrote:
> Hi,
> I just got this stack trace. I am not sure whom to send it to; searching the
> MAINTAINERS file for ELF gave me nothing. ;-)
>
> [105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
> [105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
> [105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
> [105670.434401] Oops: 0000 [#1] SMP
> [105670.434413] Modules linked in: iwldvm iwlwifi
> [105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8

Is this reproducible?

thanks,

greg k-h

2013-08-29 22:22:01

by Martin MOKREJŠ

Subject: Re: 3.10.9: Oops at elf_core_dump()

This is the first time I have seen it. Admittedly, I am doing something rather unusual
(http://bugs.python.org/issue18843).

I am trying to find out why I get memory corruption in Python applications.
So I installed DUMA from http://duma.sourceforge.net and tried to recompile and reinstall
the failing python. In a previous attempt DUMA exited, so per the README instructions
I increased the vm.max_map_count value.

# export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
# sysctl -w vm.max_map_count=1000000
# emerge dev-lang/python:2.7
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <[email protected]>
Copyright (C) 2002-2008 Hayati Ayguen <[email protected]>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <[email protected]>

* IMPORTANT: 11 news items need reading for repository 'gentoo'.
* Use eselect news to read news items.

* IMPORTANT: config file '5 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <[email protected]>
Copyright (C) 2002-2008 Hayati Ayguen <[email protected]>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <[email protected]>

' needs updating.
* See the CONFIGURATION FILES section of the emerge
* man page to learn how to update config files.
Calculating dependencies |
DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
if your (Linux) system may limit the number of different page mappings per process

[and it crashed; Ctrl+C did not work]

Sorry, I do not know what more to say. I just crashed the kernel, but apart from
the Oops it has kept working so far. The core file size is zero.
Martin

--
Martin Mokrejs, Ph.D.
Bioinformatics
Donovalska 1658
149 00 Prague
Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs

2013-08-29 23:34:38

by Martin MOKREJŠ

Subject: Re: 3.10.9: Oops at elf_core_dump()

So it happened again:

$ export LD_PRELOAD=/usr/lib64/libduma.so.0.0.0
$ python memory-corruption-test.py
DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <[email protected]>
Copyright (C) 2002-2008 Hayati Ayguen <[email protected]>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <[email protected]>

DUMA 2.5.15 (shared library, NO_LEAKDETECTION)
Copyright (C) 2006 Michael Eddington <[email protected]>
Copyright (C) 2002-2008 Hayati Ayguen <[email protected]>, Procitec GmbH
Copyright (C) 1987-1999 Bruce Perens <[email protected]>

Finished one record
Finished one record
Finished one record
Finished one record
Finished one record
[cut]

DUMA Aborting: mprotect() failed: Cannot allocate memory.
Check README section 'MEMORY USAGE AND EXECUTION SPEED'
if your (Linux) system may limit the number of different page mappings per process
Fatal Python error: Illegal instruction

Current thread 0x00007fc9c4803740:
File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 106 in endElement
File "/mnt/1TB/var/tmp/portage/dev-lang/python-2.7.5-r2/work/Python-2.7.5/Modules/pyexpat.c", line 618 in EndElement
File "/usr/lib64/python2.7/site-packages/Bio/Blast/NCBIXML.py", line 654 in parse
File "memory-corruption-test.py", line 55 in doparse
File "memory-corruption-test.py", line 104 in main
File "memory-corruption-test.py", line 109 in <module>

The stack trace is a little different, but ... I think I need to find out which resource
to raise so that DUMA can keep running while watching the python binary.

[112567.987073] BUG: unable to handle kernel NULL pointer dereference at (null)
[112567.987684] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
[112567.988282] PGD 28be2c067 PUD 3a7744067 PMD 0
[112567.988879] Oops: 0000 [#2] SMP
[112567.989468] Modules linked in: iwldvm iwlwifi
[112567.990057] CPU: 0 PID: 8822 Comm: python2.7 Tainted: G D 3.10.9-default-pciehp #8
[112567.990655] Hardware name: Dell Inc. Vostro 3550/, BIOS A11 08/03/2012
[112567.991249] task: ffff8803b5eb0fd0 ti: ffff8803b36d4000 task.ti: ffff8803b36d4000
[112567.991845] RIP: 0010:[<ffffffff812f7b42>] [<ffffffff812f7b42>] strlen+0x2/0x20
[112567.992443] RSP: 0018:ffff8803b36d59f0 EFLAGS: 00010246
[112567.993039] RAX: 0000000000000000 RBX: 0000000000000003 RCX: ffff8803b5eb0fd0
[112567.993643] RDX: 00000000016e3610 RSI: 0000000000000000 RDI: 0000000000000000
[112567.994249] RBP: ffff8803b36d5a08 R08: 00000000fffffffa R09: 0000000000000000
[112567.994854] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8803b36d5b00
[112567.995459] R13: 00007ffffffff000 R14: 0000000000000004 R15: 0000000000000000
[112567.996060] FS: 00007fc9c4803740(0000) GS:ffff88041d800000(0000) knlGS:0000000000000000
[112567.996664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[112567.997268] CR2: 0000000000000000 CR3: 00000003a3b84000 CR4: 00000000000407f0
[112567.997884] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[112567.998495] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[112567.999100] Stack:
[112567.999698] ffffffff811bb0a3 ffff8803b36d5e60 00000000000003d8 ffff8803b36d5c08
[112568.000315] ffffffff811bbcbd ffffffff811bb913 0000000000000002 0000000000000000
[112568.000931] ffff8803b5eb0fd0 ffff8803b36dffff 0000000000000246 000f424200000000
[112568.001548] Call Trace:
[112568.002156] [<ffffffff811bb0a3>] ? notesize.isra.11+0x13/0x30
[112568.002774] [<ffffffff811bbcbd>] elf_core_dump+0xbfd/0x1570
[112568.003392] [<ffffffff811bb913>] ? elf_core_dump+0x853/0x1570
[112568.004012] [<ffffffff81097209>] ? unshare_files+0x29/0xa0
[112568.004629] [<ffffffff811c289c>] do_coredump+0xafc/0xff0
[112568.005247] [<ffffffff810a63c8>] ? __sigqueue_free+0x38/0x40
[112568.005865] [<ffffffff810a9961>] get_signal_to_deliver+0x1c1/0x5c0
[112568.006488] [<ffffffff810b5d90>] ? pid_vnr+0x30/0x30
[112568.007108] [<ffffffff81002303>] do_signal+0x53/0x8e0
[112568.007725] [<ffffffff81002bef>] do_notify_resume+0x5f/0x70
[112568.008342] [<ffffffff812fc71e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[112568.008965] [<ffffffff817999e2>] int_signal+0x12/0x17
[112568.009583] Code: 48 89 e5 f6 82 40 c6 84 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 40 c6 84 81 20 75 f0 5d c3 66 0f 1f 44 00 00 31 c0 <80> 3f 00 55 48 89 e5 74 11 48 89 f8 66 90 48 83 c0 01 80 38 00
[112568.011042] RIP [<ffffffff812f7b42>] strlen+0x2/0x20
[112568.011748] RSP <ffff8803b36d59f0>
[112568.012445] CR2: 0000000000000000
[112568.013155] ---[ end trace 9d67aee555e92d76 ]---

2013-08-30 06:57:16

by Dan Aloni

Subject: Re: 3.10.9: Oops at elf_core_dump()

On Thu, Aug 29, 2013 at 03:05:50PM -0700, Greg KH wrote:
> On Thu, Aug 29, 2013 at 11:46:18PM +0200, Martin MOKREJŠ wrote:
> > Hi,
> > I just got this stack trace. I am not sure whom to send it to; searching the
> > MAINTAINERS file for ELF gave me nothing. ;-)
> >
> > [105670.434336] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [105670.434366] IP: [<ffffffff812f7b42>] strlen+0x2/0x20
> > [105670.434385] PGD 18c8e5067 PUD 2b547e067 PMD 0
> > [105670.434401] Oops: 0000 [#1] SMP
> > [105670.434413] Modules linked in: iwldvm iwlwifi
> > [105670.434432] CPU: 0 PID: 7497 Comm: emerge Not tainted 3.10.9-default-pciehp #8
>
> Is this reproducible?

Yes, and here is my analysis:

fill_files_note(&info->files) exits early when there are too many VM areas, or
when vmalloc() fails under memory pressure. That leaves a NULL string in
info->files, and notesize() then crashes on it.
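
A minimal user-space sketch of that failure mode, not the kernel code itself. All `model_` names are hypothetical, and the two constants (64 bytes estimated per mapping, a 4 MiB cap on the note) are assumptions about fs/binfmt_elf.c of that era rather than values stated in this thread. The fill function bails out early and leaves the note untouched, so the name pointer is still NULL when the size helper looks at it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, modelled loosely on fs/binfmt_elf.c; not from the thread. */
#define MODEL_MAX_FILE_NOTE_SIZE (4u * 1024u * 1024u)
#define MODEL_BYTES_PER_MAPPING  64u

/* Hypothetical stand-in for the kernel's in-memory ELF note. */
struct model_note {
    const char *name;    /* stays NULL if the note is never filled */
    unsigned int datasz;
    void *data;
};

/* Returns 1 if the note was filled, 0 if the function bailed out early. */
static int model_fill_files_note(struct model_note *note, unsigned int map_count)
{
    unsigned int size = map_count * MODEL_BYTES_PER_MAPPING;

    assert(note != NULL);
    if (size >= MODEL_MAX_FILE_NOTE_SIZE)
        return 0;            /* early exit: *note is left untouched */

    note->name = "CORE";
    note->datasz = size;
    note->data = NULL;       /* payload omitted in this model */
    return 1;
}

/* Size helper that refuses to call strlen() on a NULL name.
 * An unguarded version would dereference NULL at this point. */
static long model_notesize(const struct model_note *note)
{
    if (note->name == NULL)
        return -1;
    return (long)(strlen(note->name) + 1 + note->datasz);
}
```

With 200000 mappings the estimate is 200000 * 64 = 12800000 bytes, well above the assumed 4194304-byte cap, so the model takes the early-exit path and `model_notesize()` reports -1 instead of crashing.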

As root, do:

echo 300000 > /proc/sys/vm/max_map_count

then, as a regular user:

ulimit -c unlimited
gcc prog.c -o prog
./prog

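prog.c:
-------
#include <sys/mman.h>

int main(int argc, char *argv[])
{
    char *p, *t;
    int i;

    /* Arbitrary fixed base address for the mappings */
    p = (void *)0x444400000000;

    /* Map 200000 single pages, each followed by a one-page hole,
     * so that adjacent mappings cannot be merged into one VM area */
    for (i = 0; i < 200000; i++) {
        t = mmap(p, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC,
                 MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE,
                 -1, 0);
        p = &p[0x2000];
    }

    /* Write through a NULL pointer to force SIGSEGV and a core dump */
    *((char *)0x0) = 0;

    return 0;
}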

And the result:

user@guestvm:~$ c[ 380.520865] BUG: unable to handle kernel NULL pointer dereference at 0000000000000086
[ 380.523196] IP: [<ffffffff812ee180>] strim+0x80/0x80
[ 380.524477] PGD 3abc6067 PUD 3c7b4067 PMD 0
[ 380.525974] Oops: 0000 [#1] SMP

Entering kdb (current=0xffff880033ee8000, pid 1716) on processor 0 Oops: (null)
due to oops @ 0xffffffff812ee180
dCPU: 0 PID: 1716 Comm: a.out Not tainted 3.10.9-mod-nodbg+ #1
dHardware name: Bochs Bochs, BIOS Bochs 01/01/2011
dtask: ffff880033ee8000 ti: ffff880034eec000 task.ti: ffff880034eec000
dRIP: 0010:[<ffffffff812ee180>] [<ffffffff812ee180>] strim+0x80/0x80
dRSP: 0000:ffff880034eeda30 EFLAGS: 00010292
dRAX: 0000000000c353c0 RBX: 00000000ffff8800 RCX: ffff880033ee8000
dRDX: 0000000000493f78 RSI: 00000000ffff8800 RDI: 0000000000000086
dRBP: ffff880034eeda48 R08: 00000000fffffffd R09: 0000000000000000
dR10: 0000000000000000 R11: ffffffff812e6c4e R12: ffff880034eedb78
dR13: 00007ffffffff000 R14: 0000000000000000 R15: ffffffff81802708
dFS: 00007fd9f8bf7740(0000) GS:ffff88003f200000(0000) knlGS:0000000000000000
dCS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
dCR2: 0000000000000086 CR3: 000000003ae91000 CR4: 00000000001407f0
dDR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
dDR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
dStack:
ffffffff811d29a5 ffff880033ee8000 00000000000003d8 ffff880034eedc38
ffffffff811d3579 ffff880034eeda88 ffffffff8108625a ffff88003a522300
ffff880033ee8000 0000000000493f78 0000ffff00030d51 ffff880000493f78
dCall Trace:
d [<ffffffff811d29a5>] ? notesize.isra.9+0x15/0x30
d [<ffffffff811d3579>] elf_core_dump+0xbb9/0x1460
d [<ffffffff8108625a>] ? finish_task_switch+0x4a/0x100
d [<ffffffff8164054d>] ? schedule+0x5d/0x60
d [<ffffffff81084a23>] ? __wake_up+0x53/0x70
d [<ffffffff811dbaee>] do_coredump+0xb8e/0xef0
d [<ffffffff8106632d>] ? __sigqueue_free+0x3d/0x50
d [<ffffffff81069bcf>] get_signal_to_deliver+0x53f/0x5d0
d [<ffffffff81637c03>] ? bad_area+0x44/0x4c
d [<ffffffff810123c7>] do_signal+0x57/0x570
d [<ffffffff8108cf0d>] ? __dequeue_entity+0x3d/0x50
d [<ffffffff81637eda>] ? printk+0x61/0x63
d [<ffffffff8108625a>] ? finish_task_switch+0x4a/0x100
d [<ffffffff8164030b>] ? __schedule+0x6bb/0x800
d [<ffffffff8101291e>] do_notify_resume+0x3e/0x90
d [<ffffffff81641b3c>] retint_signal+0x48/0x8c

Some systems genuinely need a very large max_map_count, so we cannot simply
avoid large values. So, binfmt_elf.c should be fixed.
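
To see how close a process is to that limit, one option is to count the lines of /proc/self/maps (roughly one line per mapping) and compare with /proc/sys/vm/max_map_count. The following is only a sketch; the helper names are hypothetical and the one-line-per-mapping reading is an approximation.

```c
#include <assert.h>
#include <stdio.h>

/* Count newline-terminated lines on an already opened stream. */
static long count_lines_stream(FILE *f)
{
    long lines = 0;
    int c;

    assert(f != NULL);
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* Count lines in a file, or return -1 if it cannot be opened. */
static long count_lines_path(const char *path)
{
    FILE *f = fopen(path, "r");
    long lines;

    if (!f)
        return -1;
    lines = count_lines_stream(f);
    fclose(f);
    return lines;
}

/* Read one decimal number from a file, or return -1 on failure. */
static long read_long_path(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Print the current mapping count next to the system-wide limit. */
static void report_map_usage(void)
{
    printf("mappings: %ld, vm.max_map_count: %ld\n",
           count_lines_path("/proc/self/maps"),
           read_long_path("/proc/sys/vm/max_map_count"));
}
```

Calling `report_map_usage()` from inside the process of interest (for example at the end of the mmap loop in a reproducer) prints both numbers side by side.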

--
Dan Aloni

2013-08-31 06:21:04

by Dan Aloni

Subject: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

A high setting of max_map_count, and a process core-dumping with
a large enough vm_map_count could result in an NT_FILE note not
being written, and the kernel crashing immediately later because
it has assumed otherwise.

Reproduction of the bug described here:

https://lkml.org/lkml/2013/8/30/50

Issue originating in 2aa362c49 (from Oct 4, 2012).

This patch makes that section optional in that case.
fill_files_note() should signify the error, and also let the info
struct in elf_core_dump() be zero-initialized so that we can check
for the optionally written note.
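
The shape of that change can be sketched in user space as follows. This is an illustration under assumptions, not the patch: the `sketch_` names are hypothetical, calloc()/free() stand in for vmalloc()/vfree(), and the size limit is an arbitrary small number. The fill function reports failure through its return value, the caller counts the note only on success, and cleanup checks the optional pointer, which is NULL because the info struct starts zero-initialized.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_NOTE_SIZE 4096u   /* arbitrary limit for this sketch */

struct sketch_note {
    const char *name;
    size_t datasz;
    void *data;
};

struct sketch_info {
    struct sketch_note notes[5];
    struct sketch_note *notes_files; /* NULL unless the optional note exists */
    int numnote;
};

/* Returns 0 on success or a negative errno value, leaving *note untouched. */
static int sketch_fill_files_note(struct sketch_note *note, size_t size)
{
    void *data;

    if (size >= SKETCH_MAX_NOTE_SIZE)
        return -E2BIG;
    data = calloc(1, size ? size : 1);
    if (!data)
        return -ENOMEM;

    note->name = "CORE";
    note->datasz = size;
    note->data = data;
    return 0;
}

/* The four mandatory notes are assumed to be filled elsewhere. */
static void sketch_fill_note_info(struct sketch_info *info, size_t files_size)
{
    assert(info != NULL);
    info->numnote = 4;
    if (sketch_fill_files_note(&info->notes[info->numnote], files_size) == 0) {
        info->notes_files = &info->notes[info->numnote];
        info->numnote++;
    }
}

/* Free only what was possibly allocated for the optional note. */
static void sketch_free_note_info(struct sketch_info *info)
{
    if (info->notes_files) {
        free(info->notes_files->data);
        info->notes_files = NULL;
    }
}
```

The zero initialization matters: without it, `notes_files` would hold garbage after a failed fill and the cleanup check would be meaningless.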

Cc'ed original signers.

Cc'ed Al Viro because it trivially relies on his linux-next
tree changes.

Signed-off-by: Dan Aloni <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
---
fs/binfmt_elf.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index dc82279..e1a323a 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1429,7 +1429,7 @@ static void fill_siginfo_note(struct memelfnote *note, user_siginfo_t *csigdata,
* long file_ofs
* followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...
*/
-static void fill_files_note(struct memelfnote *note)
+static int fill_files_note(struct memelfnote *note)
{
struct vm_area_struct *vma;
unsigned count, size, names_ofs, remaining, n;
@@ -1444,11 +1444,11 @@ static void fill_files_note(struct memelfnote *note)
names_ofs = (2 + 3 * count) * sizeof(data[0]);
alloc:
if (size >= MAX_FILE_NOTE_SIZE) /* paranoia check */
- goto err;
+ return -E2BIG;
size = round_up(size, PAGE_SIZE);
data = vmalloc(size);
if (!data)
- goto err;
+ return -ENOMEM;

start_end_ofs = data + 2;
name_base = name_curpos = ((char *)data) + names_ofs;
@@ -1501,7 +1501,7 @@ static void fill_files_note(struct memelfnote *note)

size = name_curpos - (char *)data;
fill_note(note, "CORE", NT_FILE, size, data);
- err: ;
+ return 0;
}

#ifdef CORE_DUMP_USE_REGSET
@@ -1623,6 +1623,7 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
struct elf_prpsinfo *psinfo;
struct core_thread *ct;
unsigned int i;
+ int ret;

info->size = 0;
info->thread = NULL;
@@ -1702,8 +1703,9 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
fill_auxv_note(&info->auxv, current->mm);
info->size += notesize(&info->auxv);

- fill_files_note(&info->files);
- info->size += notesize(&info->files);
+ ret = fill_files_note(&info->files);
+ if (!ret)
+ info->size += notesize(&info->files);

return 1;
}
@@ -1735,7 +1737,7 @@ static int write_note_info(struct elf_note_info *info,
return 0;
if (first && !writenote(&info->auxv, cprm))
return 0;
- if (first && !writenote(&info->files, cprm))
+ if (first && info->files.data && !writenote(&info->files, cprm))
return 0;

for (i = 1; i < info->thread_notes; ++i)
@@ -1822,6 +1824,7 @@ static int elf_dump_thread_status(long signr, struct elf_thread_status *t)

struct elf_note_info {
struct memelfnote *notes;
+ struct memelfnote *notes_files;
struct elf_prstatus *prstatus; /* NT_PRSTATUS */
struct elf_prpsinfo *psinfo; /* NT_PRPSINFO */
struct list_head thread_list;
@@ -1865,6 +1868,7 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,
siginfo_t *siginfo, struct pt_regs *regs)
{
struct list_head *t;
+ int ret;

if (!elf_note_info_init(info))
return 0;
@@ -1912,9 +1916,13 @@ static int fill_note_info(struct elfhdr *elf, int phdrs,

fill_siginfo_note(info->notes + 2, &info->csigdata, siginfo);
fill_auxv_note(info->notes + 3, current->mm);
- fill_files_note(info->notes + 4);
+ info->numnote = 4;

- info->numnote = 5;
+ ret = fill_files_note(info->notes + info->numnote);
+ if (!ret) {
+ info->notes_files = info->notes + info->numnote;
+ info->numnote++;
+ }

/* Try to dump the FPU. */
info->prstatus->pr_fpvalid = elf_core_copy_task_fpregs(current, regs,
@@ -1976,8 +1984,9 @@ static void free_note_info(struct elf_note_info *info)
kfree(list_entry(tmp, struct elf_thread_status, list));
}

- /* Free data allocated by fill_files_note(): */
- vfree(info->notes[4].data);
+ /* Free data possibly allocated by fill_files_note(): */
+ if (info->notes_files)
+ vfree(info->notes_files->data);

kfree(info->prstatus);
kfree(info->psinfo);
@@ -2059,7 +2068,7 @@ static int elf_core_dump(struct coredump_params *cprm)
struct vm_area_struct *vma, *gate_vma;
struct elfhdr *elf = NULL;
loff_t offset = 0, dataoff;
- struct elf_note_info info;
+ struct elf_note_info info = {0, };
struct elf_phdr *phdr4note = NULL;
struct elf_shdr *shdr4extnum = NULL;
Elf_Half e_phnum;
--
1.8.1.4

2013-08-31 13:39:11

by Martin MOKREJŠ

Subject: Re: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

Hi Dan,
thank you for your work on my issue. I would like to test it on 3.10.9 where
I faced the problem initially.

linux-3.10.9 # patch -p1 < ../patches/vm_map_count.patch
patching file fs/binfmt_elf.c
Hunk #1 succeeded at 1415 (offset -14 lines).
Hunk #2 succeeded at 1430 (offset -14 lines).
Hunk #3 succeeded at 1487 (offset -14 lines).
Hunk #4 succeeded at 1609 (offset -14 lines).
Hunk #5 succeeded at 1689 (offset -14 lines).
Hunk #6 FAILED at 1737.
Hunk #7 succeeded at 1810 (offset -14 lines).
Hunk #8 succeeded at 1854 (offset -14 lines).
Hunk #9 succeeded at 1902 (offset -14 lines).
Hunk #10 succeeded at 1970 (offset -14 lines).
Hunk #11 FAILED at 2068.
2 out of 11 hunks FAILED -- saving rejects to file fs/binfmt_elf.c.rej
#

Thank you.

2013-08-31 13:51:31

by Dan Aloni

Subject: Re: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

On Sat, Aug 31, 2013 at 03:38:33PM +0200, Martin MOKREJŠ wrote:
> Hi Dan,
> thank you for your work on my issue. I would like to test it on 3.10.9 where
> I faced the problem initially.

Sure, see the attached patch for 3.10.9.

--
Dan Aloni

Attachments:

(No filename) (246.00 B)
0001-Prevent-a-coredump-with-a-large-max_map_count-from-O.patch (4.82 kB)

2013-09-01 00:13:45

by Martin MOKREJŠ

Subject: Re: [PATCH linux-next] Prevent a coredump with a large vm_map_count from Oopsing

Dan Aloni wrote:
> On Sat, Aug 31, 2013 at 03:38:33PM +0200, Martin MOKREJŠ wrote:
>> Hi Dan,
>> thank you for your work on my issue. I would like to test it on 3.10.9 where
>> I faced the problem initially.
>
> Sure, see the attached patch for 3.10.9.

Thanks, it works for my case. You can add my Reported-by: and Tested-by:. ;-)