Date: Tue, 31 May 2011 00:14:32 -0400 (EDT)
From: CAI Qian
To: KOSAKI Motohiro
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, hughd@google.com, kamezawa hiroyu, minchan kim, oleg@redhat.com
Message-ID: <1582158305.317043.1306815272554.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
In-Reply-To: <4DE46A4B.40401@jp.fujitsu.com>
Subject: Re: [PATCH v2 0/5] Fix oom killer doesn't work at all if system have > gigabytes memory (aka CAI founded issue)

----- Original Message -----
> (2011/05/31 10:33), CAI Qian wrote:
> > Hello,
> >
> > Have tested those patches rebased from KOSAKI for the latest
> > mainline.
> > It still killed random processes and received a panic at the end
> > when run as the root user. The full oom output can be found here.
> > http://people.redhat.com/qcai/oom
>
> You ran the fork-bomb as root. Therefore unprivileged processes were
> killed first.
> It's not random. It's intentional and desirable. I mean
>
> - If you run the same program as non-root, python will be killed
>   first, because it consumes far more memory than the daemons.
> - If you run the same program as root, non-root processes and
>   processes that explicitly drop privileges (e.g. irqbalance) will
>   be killed first.
>
>
> Look, your log says, highest oom score process was killed first.
>
> Out of memory: Kill process 5462 (abrtd) points:393 total-vm:262300kB, anon-rss:1024kB, file-rss:0kB
> Out of memory: Kill process 5277 (hald) points:303 total-vm:25444kB, anon-rss:1116kB, file-rss:0kB
> Out of memory: Kill process 5720 (sshd) points:258 total-vm:97684kB, anon-rss:824kB, file-rss:0kB
> Out of memory: Kill process 5457 (pickup) points:236 total-vm:78672kB, anon-rss:768kB, file-rss:0kB
> Out of memory: Kill process 5451 (master) points:235 total-vm:78592kB, anon-rss:796kB, file-rss:0kB
> Out of memory: Kill process 5458 (qmgr) points:233 total-vm:78740kB, anon-rss:764kB, file-rss:0kB
> Out of memory: Kill process 5353 (sshd) points:189 total-vm:63992kB, anon-rss:620kB, file-rss:0kB
> Out of memory: Kill process 1626 (dhclient) points:129 total-vm:9148kB, anon-rss:484kB, file-rss:0kB

OK, there was also a panic at the end. Is that expected?

BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [] get_mm_counter+0x14/0x30
PGD 0
Oops: 0000 [#1] SMP
CPU 7
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 dm_mirror dm_region_hash dm_log microcode serio_raw pcspkr cdc_ether usbnet mii i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support sg shpchp ioatdma dca i7core_edac edac_core bnx2 ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas dm_mod [last unloaded: scsi_wait_scan]

Pid: 5232, comm: dbus-daemon Not tainted 3.0.0-rc1+ #3 IBM System x3550 M3 -[7944I21]-/69Y4438
RIP: 0010:[] [] get_mm_counter+0x14/0x30
RSP: 0000:ffff88027116b828 EFLAGS: 00010286
RAX: 00000000000002a0 RBX: ffff880470cd8a80 RCX: 0000000000000003
RDX: 000000000000000e RSI: 0000000000000002 RDI: 0000000000000000
RBP: ffff88027116b828 R08: 0000000000000000 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000007 R12: ffff88027116b880
R13: 0000000000000000 R14: 0000000000000000 R15: ffff880270df2100
FS: 00007f78a3837700(0000) GS:ffff88047fc60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 000000047238f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dbus-daemon (pid: 5232, threadinfo ffff88027116a000, task ffff880270df2100)
Stack:
 ffff88027116b8b8 ffffffff81104c60 0000000000000000 0000000000000000
 ffff8802704c4680 0000000000000000 ffff8802705161c0 0000000000000000
 0000000000000000 0000000000000000 0000000000000286 ffff880470cd8e98
Call Trace:
 [] dump_tasks+0xa0/0x160
 [] dump_header+0xb5/0xd0
 [] oom_kill_process+0xa5/0x1c0
 [] out_of_memory+0xff/0x220
 [] __alloc_pages_slowpath+0x632/0x6b0
 [] __alloc_pages_nodemask+0x1a4/0x1f0
 [] kmem_getpages+0x62/0x170
 [] fallback_alloc+0x1ba/0x270
 [] ? cache_grow+0x2c3/0x2f0
 [] ____cache_alloc_node+0x95/0x150
 [] kmem_cache_alloc+0xfd/0x190
 [] taskstats_exit+0x1cd/0x240
 [] do_exit+0x177/0x430
 [] do_group_exit+0x51/0xc0
 [] get_signal_to_deliver+0x203/0x470
 [] do_signal+0x69/0x190
 [] do_notify_resume+0x65/0x80
 [] int_signal+0x12/0x17
Code: 48 8b 00 c9 48 d1 e8 83 e0 01 c3 0f 1f 40 00 31 c0 c9 c3 0f 1f 40 00 55 48 89 e5 66 66 66 66 90 48 63 f6 48 8d 84 f7 90 02 00 00 8b 50 08 31 c0 c9 48 85 d2 48 0f 49 c2 c3 66 66 66 66 2e 0f
RIP [] get_mm_counter+0x14/0x30
 RSP
CR2: 00000000000002a8
---[ end trace 742b26ee0c4fab73 ]---
Fixing recursive fault but reboot is needed!
Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
Pid: 4, comm: kworker/0:0 Tainted: G D 3.0.0-rc1+ #3
Call Trace:
 [] panic+0x91/0x1a8
 [] watchdog_overflow_callback+0xb1/0xc0
 [] __perf_event_overflow+0x9d/0x250
 [] perf_event_overflow+0x14/0x20
 [] intel_pmu_handle_irq+0x326/0x530
 [] perf_event_nmi_handler+0x29/0xa0
 [] notifier_call_chain+0x55/0x80
 [] atomic_notifier_call_chain+0x1a/0x20
 [] notify_die+0x2e/0x30
 [] default_do_nmi+0x39/0x1f0
 [] do_nmi+0x80/0xa0
 [] nmi+0x20/0x30
 [] ? __write_lock_failed+0x9/0x20
 <> [] ? _raw_write_lock_irq+0x1e/0x20
 [] forget_original_parent+0x3c/0x330
 [] exit_notify+0x1b/0x190
 [] do_exit+0x1fd/0x430
 [] ? manage_workers+0x120/0x120
 [] kthread+0x8e/0xa0
 [] kernel_thread_helper+0x4/0x10
 [] ? kthread_worker_fn+0x1a0/0x1a0
 [] ? gs_change+0x13/0x13
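
For anyone who wants to check the quoted claim that the highest oom score process was killed first, here is a minimal sketch that parses the "Out of memory: Kill process" lines from the log above and checks that the points values never increase from one kill to the next. It is not part of the patch series. The regular expression and the helper names (parse_kills, killed_in_score_order) are assumptions made for illustration only, and the sketch says nothing about how the kernel computes the scores.

```python
import re

# Kill lines copied from the quoted OOM log, one record per line.
LOG = """\
Out of memory: Kill process 5462 (abrtd) points:393 total-vm:262300kB, anon-rss:1024kB, file-rss:0kB
Out of memory: Kill process 5277 (hald) points:303 total-vm:25444kB, anon-rss:1116kB, file-rss:0kB
Out of memory: Kill process 5720 (sshd) points:258 total-vm:97684kB, anon-rss:824kB, file-rss:0kB
Out of memory: Kill process 5457 (pickup) points:236 total-vm:78672kB, anon-rss:768kB, file-rss:0kB
Out of memory: Kill process 5451 (master) points:235 total-vm:78592kB, anon-rss:796kB, file-rss:0kB
Out of memory: Kill process 5458 (qmgr) points:233 total-vm:78740kB, anon-rss:764kB, file-rss:0kB
Out of memory: Kill process 5353 (sshd) points:189 total-vm:63992kB, anon-rss:620kB, file-rss:0kB
Out of memory: Kill process 1626 (dhclient) points:129 total-vm:9148kB, anon-rss:484kB, file-rss:0kB
"""

# Hypothetical pattern matching only the fields used here: pid, comm, points.
KILL_RE = re.compile(r"Kill process (\d+) \(([^)]+)\) points:(\d+)")


def parse_kills(log):
    """Return (pid, comm, points) tuples in the order they appear in the log."""
    return [(int(pid), comm, int(points))
            for pid, comm, points in KILL_RE.findall(log)]


def killed_in_score_order(kills):
    """True if each victim's points are <= those of the victim before it."""
    points = [p for _, _, p in kills]
    return all(a >= b for a, b in zip(points, points[1:]))


kills = parse_kills(LOG)
for pid, comm, points in kills:
    print(f"{points:4d}  {pid:5d}  {comm}")
print("killed in descending score order:", killed_in_score_order(kills))
```

For the eight lines quoted above the check passes, which matches KOSAKI's reading of the log. It does not explain the later NULL pointer dereference in get_mm_counter.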