Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id ; Sun, 10 Nov 2002 09:25:57 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id ; Sun, 10 Nov 2002 09:25:57 -0500
Received: from 10fwd.cistron-office.nl ([62.216.29.197]:18654 "EHLO smtp.cistron-office.nl")
	by vger.kernel.org with ESMTP
	id ; Sun, 10 Nov 2002 09:25:55 -0500
Date: Sun, 10 Nov 2002 15:32:36 +0100
From: Miquel van Smoorenburg
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.5.46: kernel BUG at kernel/timer.c:333!
Message-ID: <20021110153236.A18563@cistron.nl>
References: <3DCD5917.FEEA7C5D@digeo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3DCD5917.FEEA7C5D@digeo.com>; from akpm@digeo.com on Sat, Nov 09, 2002 at 10:51:03AM -0800
X-NCC-RegID: nl.cistron
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3639
Lines: 97

According to Andrew Morton:
> Miquel van Smoorenburg wrote:
> > I can reliably crash 2.5.X on one of our newsservers (dual PIII/450, GigE,
> > lots of disk- and network I/O).
> >
> > kernel BUG at kernel/timer.c:333!
>
> There are timer fixes in Linus's current tree. The problem which
> they address could cause this BUG.

I've booted 2.5.46bk5 on the machine, and it has been running for
over 2 hours with extra heavy diskio. That reliably crashed the
machine in about 45 minutes with 2.4.45 and 2.5.46, machine is
still up now.

> > Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
> > Call Trace:
> >  [] __might_sleep+0x54/0x58
> >  [] set_shrinker+0x3c/0x7c
> >  [] mb_cache_create+0x1c4/0x244
> >  [] mb_cache_shrink_fn+0x0/0x170
> >  [] init+0x47/0x1ac
> >  [] init+0x0/0x1ac
> >  [] kernel_thread_helper+0x5/0xc
>
> That's different. A fix for this is in Linus's tree.

Indeed, I don't see that one anymore in -bk5

I'm still seeing the buffer layer error at fs/buffer.c:1623, though.
Happens when a blockdev is close()d. Is a fix for this in -mm2?
Does -mm2 include -bk5? If so I'll put that on it and keep an eye
on it tomorrow, see what happens.

Debug messages I'm still seeing:

(note that I compiled IPv6 into the kernel since we're slowly moving
our network to IPv6 but that it is otherwise unused right now, and
that the previous kernels that crashed on me didn't have IPv6 in it)

Uninitialised timer!
This is just a warning. Your computer is OK
function=0xc0285748, data=0xf78a6680
Call Trace:
 [] check_timer_failed+0x40/0x54
 [] igmp6_timer_handler+0x0/0x58
 [] del_timer+0x16/0x84
 [] igmp6_join_group+0x94/0x124
 [] igmp6_group_added+0xcc/0xd8
 [] tcp_v6_err+0x3a7/0x63c
 [] ipv6_dev_mc_inc+0x31e/0x330
 [] addrconf_join_solict+0x3a/0x44
 [] addrconf_dad_start+0x13/0x15c
 [] addrconf_add_linklocal+0x28/0x44
 [] addrconf_dev_config+0x97/0xa4
 [] addrconf_notify+0x52/0xc0
 [] notifier_call_chain+0x1f/0x38
 [] dev_open+0xa6/0xb0
 [] dev_change_flags+0x51/0x104
 [] devinet_ioctl+0x2bc/0x598
 [] inet_ioctl+0x77/0xb4
 [] sock_ioctl+0x267/0x298
 [] sys_ioctl+0x22d/0x27c
 [] error_code+0x2d/0x38
 [] syscall_call+0x7/0xb

buffer layer error at fs/buffer.c:1623
Pass this trace through ksymoops for reporting
Call Trace:
 [] __buffer_error+0x33/0x38
 [] __block_write_full_page+0x7f/0x3bc
 [] block_write_full_page+0x2d/0x9c
 [] blkdev_get_block+0x0/0x48
 [] blkdev_writepage+0xf/0x14
 [] blkdev_get_block+0x0/0x48
 [] mpage_writepages+0x21b/0x398
 [] blkdev_writepage+0x0/0x14
 [] __pagevec_free+0x1f/0x28
 [] release_pages+0x171/0x17c
 [] generic_writepages+0x11/0x15
 [] do_writepages+0x18/0x30
 [] filemap_fdatawrite+0x51/0x68
 [] sync_blockdev+0x1b/0x3c
 [] blkdev_put+0x7b/0x19c
 [] blkdev_close+0x12/0x18
 [] __fput+0x30/0xf8
 [] fput+0x14/0x18
 [] filp_close+0xb0/0xbc
 [] sys_close+0x76/0xa4
 [] syscall_call+0x7/0xb

Thanks,

Mike.