Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1763649AbYARTzE (ORCPT ); Fri, 18 Jan 2008 14:55:04 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1761968AbYARTyy (ORCPT ); Fri, 18 Jan 2008 14:54:54 -0500 Received: from palinux.external.hp.com ([192.25.206.14]:50354 "EHLO mail.parisc-linux.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761915AbYARTyx (ORCPT ); Fri, 18 Jan 2008 14:54:53 -0500 Date: Fri, 18 Jan 2008 12:54:52 -0700 From: Matthew Wilcox To: Christoph Lameter Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: SLUB: Increasing partial pages Message-ID: <20080118195451.GE20490@parisc-linux.org> References: <20080116195949.GO18741@parisc-linux.org> <20080116214127.GA11559@parisc-linux.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.13 (2006-08-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 15083 Lines: 315 On Wed, Jan 16, 2008 at 02:01:08PM -0800, Christoph Lameter wrote: > Dec 6th? I was on vacation then and it seems that I was unable to > reproduce the oopses. Can I get some backtraces or other information > that would allow me to diagnose the problem? I've found one backtrace which seems to be relevant. I believe this is due to 8/10. I don't think the kernel for this run was saved, so we'll have to do a new run if you need more information than this. general protection fault: 0000 [1] SMP CPU 5 Modules linked in: qla2xxx scsi_transport_fc ipv6 dm_mirror dm_multipath dm_mod button ehci_hcd uhci_hcd i2c_i801 i2c_core r ng_core e1000 floppy ext3 jbd raid1 aic79xx scsi_transport_spi ata_piix ahci libata sd_mod scsi_mod Pid: 22541, comm: oracle Not tainted 2.6.24-rc1-slub2 #3 RIP: 0010:[] [] kmem_cache_alloc+0x3f/0x69 RSP: 0018:ffff81006d551b38 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff810001084d40 RCX: ffffffff80267faf RDX: 00ffff8100745818 RSI: 0000000000011220 RDI: ffffffff805b2240 RBP: 0000000000011220 R08: 0000000000000000 R09: ffff8100745e8a80 R10: 0000000000000086 R11: ffffffff88008014 R12: 0000000000000000 R13: 0000000000011220 R14: 0000000000011220 R15: 0000000000000001 FS: 00002b0b43b53c60(0000) GS:ffff81010609d880(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000f56626ff0 CR3: 000000006d4e4000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process oracle (pid: 22541, threadinfo ffff81006d550000, task ffff810069c3c790) Stack: 0000000000000000 ffff8100745e8a80 ffff810109979300 ffffffff80267faf ffff810106198000 ffffffff8023df1d 0000000000000000 ffff81010bdeee88 0000000000000086 0000000000000086 ffff8100745e8a80 0000000000000000 Call Trace: [] mempool_alloc+0x41/0xf9 [] lock_timer_base+0x24/0x49 [] :scsi_mod:scsi_alloc_sgtable+0xb6/0x19c [] :scsi_mod:scsi_init_io+0x26/0xda [] :sd_mod:sd_prep_fn+0x70/0x782 [] elv_next_request+0xed/0x13a [] :scsi_mod:scsi_request_fn+0x0/0x362 [] :scsi_mod:scsi_request_fn+0x77/0x362 [] generic_unplug_device+0x18/0x25 [] __blockdev_direct_IO+0x851/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 Code: 48 8b 34 c2 48 89 d0 48 0f b1 33 48 39 d0 75 ce 81 e5 00 80 RIP [] kmem_cache_alloc+0x3f/0x69 RSP BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 Call Trace: [] __might_sleep+0xb5/0xb7 [] down_read+0x15/0x1e [] acct_collect+0x40/0x17d [] do_exit+0x209/0x77d [] do_divide_error+0x0/0x89 [] error_exit+0x0/0x51 [] :scsi_mod:scsi_request_fn+0x0/0x362 [] mempool_alloc+0x41/0xf9 [] kmem_cache_alloc+0x3f/0x69 [] mempool_alloc+0x41/0xf9 [] lock_timer_base+0x24/0x49 [] :scsi_mod:scsi_alloc_sgtable+0xb6/0x19c [] :scsi_mod:scsi_init_io+0x26/0xda [] :sd_mod:sd_prep_fn+0x70/0x782 [] elv_next_request+0xed/0x13a [] :scsi_mod:scsi_request_fn+0x0/0x362 [] :scsi_mod:scsi_request_fn+0x77/0x362 [] generic_unplug_device+0x18/0x25 [] __blockdev_direct_IO+0x851/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 general protection fault: 0000 [2] SMP CPU 5 Modules linked in: qla2xxx scsi_transport_fc ipv6 dm_mirror dm_multipath dm_mod button ehci_hcd uhci_hcd i2c_i801 i2c_core r ng_core e1000 floppy ext3 jbd raid1 aic79xx scsi_transport_spi ata_piix ahci libata sd_mod scsi_mod Pid: 22541, comm: oracle Tainted: G D 2.6.24-rc1-slub2 #3 RIP: 0010:[] [] kmem_cache_alloc_node+0x46/0x70 RSP: 0018:ffff8101061a7e50 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff810001084d40 RCX: ffffffff803f3f43 RDX: 00000000ffffffff RSI: 00ffff8100745818 RDI: ffffffff805b2240 RBP: 0000000000000020 R08: 0000000000000005 R09: 000000000000b031 R10: ffff81006d5518f8 R11: ffff81006c14caa0 R12: 00000000000000e0 R13: ffffffff805b2240 R14: 0000000000000020 R15: 0000000000000001 FS: 00002b0b43b53c60(0000) GS:ffff81010609d880(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000f56626ff0 CR3: 000000006d4e4000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process oracle (pid: 22541, threadinfo ffff81006d550000, task ffff810069c3c790) Stack: 00000000ffffffff ffff81006c17e000 00000000ffffffff ffffffff803f3f43 0000000000000005 ffff81006c17e000 ffff810106198000 0000000000000100 ffffffff80431824 ffff8101061a7ee0 0000000000000001 ffffffff8043125a Call Trace: [] __alloc_skb+0x36/0x12a [] tcp_delack_timer+0x0/0x1e2 [] tcp_send_ack+0x24/0xfe [] tcp_delack_timer+0x185/0x1e2 [] run_timer_softirq+0x156/0x1ac [] __do_softirq+0x50/0xbb [] call_softirq+0x1c/0x28 [] do_softirq+0x2e/0x96 [] smp_apic_timer_interrupt+0x3e/0x51 [] apic_timer_interrupt+0x66/0x70 [] vgacon_cursor+0x0/0x1b1 [] _spin_unlock_irq+0x9/0xa [] __down_read+0x34/0x9e [] __might_sleep+0xb5/0xb7 [] acct_collect+0x40/0x17d [] do_exit+0x209/0x77d [] do_divide_error+0x0/0x89 [] error_exit+0x0/0x51 [] :scsi_mod:scsi_request_fn+0x0/0x362 [] mempool_alloc+0x41/0xf9 [] kmem_cache_alloc+0x3f/0x69 [] mempool_alloc+0x41/0xf9 [] lock_timer_base+0x24/0x49 [] :scsi_mod:scsi_alloc_sgtable+0xb6/0x19c [] :scsi_mod:scsi_init_io+0x26/0xda [] :sd_mod:sd_prep_fn+0x70/0x782 [] elv_next_request+0xed/0x13a [] :scsi_mod:scsi_request_fn+0x0/0x362 [] :scsi_mod:scsi_request_fn+0x77/0x362 [] generic_unplug_device+0x18/0x25 [] __blockdev_direct_IO+0x851/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 Code: 4c 8b 04 c6 48 89 f0 4c 0f b1 03 48 39 f0 75 c6 81 e5 00 80 RIP [] kmem_cache_alloc_node+0x46/0x70 RSP Kernel panic - not syncing: Aiee, killing interrupt handler! BUG: spinlock lockup on CPU#2, oracle/22999, ffff81010bdef018 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] __make_request+0x65/0x574 [] mempool_alloc+0x41/0xf9 [] generic_make_request+0x1d1/0x206 [] bio_alloc_bioset+0xcd/0x136 [] submit_bio+0xd4/0xdf [] dio_bio_submit+0x52/0x66 [] __blockdev_direct_IO+0x827/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 BUG: spinlock lockup on CPU#7, ksoftirqd/7/25, ffff81010bdef018 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] :scsi_mod:scsi_device_unbusy+0x66/0x80 [] :scsi_mod:scsi_finish_command+0x16/0xbf [] :scsi_mod:scsi_softirq_done+0xdd/0xe4 [] kobject_release+0x0/0x9 [] kobject_release+0x0/0x9 [] blk_done_softirq+0x63/0x71 [] __do_softirq+0x50/0xbb [] call_softirq+0x1c/0x28 [] ksoftirqd+0x0/0x9b [] do_softirq+0x2e/0x96 [] ksoftirqd+0x0/0x9b [] ksoftirqd+0x45/0x9b [] kthread+0x3d/0x63 [] child_rip+0xa/0x12 [] kthread+0x0/0x63 [] child_rip+0x0/0x12 BUG: spinlock lockup on CPU#1, ksoftirqd/1/7, ffff81010bdef018 Call Trace: BUG: spinlock lockup on CPU#3, oracle/22971, ffff81010bdef018 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] __make_request+0x65/0x574 [] mempool_alloc+0x41/0xf9 [] generic_make_request+0x1d1/0x206 [] bio_alloc_bioset+0xcd/0x136 [] submit_bio+0xd4/0xdf [] dio_bio_submit+0x52/0x66 [] __blockdev_direct_IO+0x827/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 [] _raw_spin_lock+0xcc/0xf3 [] :scsi_mod:scsi_device_unbusy+0x66/0x80 [] :scsi_mod:scsi_finish_command+0x16/0xbf [] :scsi_mod:scsi_softirq_done+0xdd/0xe4 [] kobject_release+0x0/0x9 [] kobject_release+0x0/0x9 [] blk_done_softirq+0x63/0x71 [] __do_softirq+0x50/0xbb [] call_softirq+0x1c/0x28 [] ksoftirqd+0x0/0x9b [] do_softirq+0x2e/0x96 [] ksoftirqd+0x0/0x9b [] ksoftirqd+0x45/0x9b [] kthread+0x3d/0x63 [] child_rip+0xa/0x12 [] kthread+0x0/0x63 [] child_rip+0x0/0x12 BUG: spinlock lockup on CPU#6, oracle/22645, ffff81010bdef018 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] __make_request+0x65/0x574 [] mempool_alloc+0x41/0xf9 [] generic_make_request+0x1d1/0x206 [] bio_alloc_bioset+0xcd/0x136 [] submit_bio+0xd4/0xdf [] dio_bio_submit+0x52/0x66 [] __blockdev_direct_IO+0x827/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 BUG: spinlock lockup on CPU#4, oracle/22730, ffff81010bdef018 Call Trace: [] _raw_spin_lock+0xcc/0xf3 [] __make_request+0x65/0x574 [] mempool_alloc+0x41/0xf9 [] generic_make_request+0x1d1/0x206 [] bio_alloc_bioset+0xcd/0x136 [] submit_bio+0xd4/0xdf [] dio_bio_submit+0x52/0x66 [] __blockdev_direct_IO+0x827/0xa34 [] blkdev_direct_IO+0x45/0x4a [] blkdev_get_blocks+0x0/0x80 [] generic_file_direct_IO+0xd0/0x103 [] generic_file_aio_read+0x86/0x160 [] __aio_get_req+0x1f/0x141 [] generic_file_aio_read+0x0/0x160 [] aio_rw_vect_retry+0x75/0x171 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 [] aio_rw_vect_retry+0x0/0x171 [] aio_run_iocb+0x5d/0xe3 [] io_submit_one+0x333/0x379 [] sys_io_submit+0xb1/0xfe [] system_call+0x7e/0x83 -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/