Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Sun, 25 Nov 2001 17:18:20 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Sun, 25 Nov 2001 17:18:02 -0500
Received: from smtp6.mindspring.com ([207.69.200.110]:18965 "EHLO smtp6.mindspring.com") by vger.kernel.org with ESMTP id ; Sun, 25 Nov 2001 17:17:50 -0500
Message-ID: <3C016E08.3C2D2537@mindspring.com>
Date: Sun, 25 Nov 2001 15:17:44 -0700
From: Jim Henderson
X-Mailer: Mozilla 4.77 [en] (X11; U; Linux 2.4.14 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org
Subject: PROBLEM: kernel BUG at filemap.c:791
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

After compiling a 2.4.14 kernel on one of my systems, I've started
receiving this problem. The kernel is patched with the ext3 filesystem
patches; however, running without ext3 in use produces the same problem.

This can happen at any time: I've seen it during startup and after the
system has been running for a couple of days. It is reproducible, but not
at will (i.e., it's a given that it will happen, but what triggers it is
unknown).

Previous kernels I've run (2.4.2 and 2.4.6) do not exhibit this problem.

Here's the crash info with symbols resolved:

--- snip ---
ksymoops 2.4.0 on i486 2.4.6.  Options used
     -v /usr/src/linux/vmlinux (specified)
     -K (specified)
     -L (specified)
     -o /lib/modules/2.4.14/ (specified)
     -m /boot/System.map-2.4.14 (specified)

No modules in ksyms, skipping objects
kernel BUG at filemap.c:791!
invalid operand: 0000
CPU:    0
EIP:    0010:[]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010086
eax: 0000001d   ebx: c1133980   ecx: c0270660   edx: 00006181
esi: c0080134   edi: c11e3c00   ebp: 00000001   esp: c4ef5b50
ds: 0018   es: 0018   ss: 0018
Process syslogd (pid: 365, stackpage=c4ef5000)
Stack: c022640b 00000317 00000000 c0080134 c01abbb4 c4cc96c0 00000001 00000002
       c011a834 00000001 c02c65c0 00000000 c11444e0 24000001 0000000e c4ef5bd4
       c01082aa 0000000e c11e3c00 c4ef5bd4 c4ef5bd4 0000000e c02bcac0 c11444e0
Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Code: 0f 0b 5a 59 8d 43 28 8d 73 24 39 43 28 74 19 89 f0 5b b9 01

>>EIP; c0122938   <=====
Trace; c01abbb4
Trace; c011a834
Trace; c01082aa
Trace; c010842d
Trace; c01ab6bc
Trace; c01176a0
Trace; c0117473
Trace; c010845c
Trace; c011a740
Trace; c019ec3a <__make_request+fa/670>
Trace; c0117473
Trace; c010845c
Trace; c019f2ea
Trace; c019e75e
Trace; c0117858 <__run_task_queue+48/60>
Trace; c012ed26 <__wait_on_buffer+56/90>
Trace; c011778b
Trace; c01176a0
Trace; c012f985
Trace; c010845c
Trace; c01ab8ba
Trace; c01d8b6c <__kfree_skb+dc/e0>
Trace; c01da75d
Trace; c020966f
Trace; c02096cf
Trace; c012fc95 <__refile_buffer+55/60>
Trace; c0130640 <__block_commit_write+a0/c0>
Trace; c0130b14
Trace; c0124db6
Trace; c0124e16
Trace; c012e248
Trace; c0124900
Trace; c0161d4a
Trace; c015fb74
Trace; c012f1aa
Trace; c0106ef3
Code;  c0122938
00000000 <_EIP>:
Code;  c0122938   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c012293a
   2:   5a                        pop    %edx
Code;  c012293b
   3:   59                        pop    %ecx
Code;  c012293c
   4:   8d 43 28                  lea    0x28(%ebx),%eax
Code;  c012293f
   7:   8d 73 24                  lea    0x24(%ebx),%esi
Code;  c0122942
   a:   39 43 28                  cmp    %eax,0x28(%ebx)
Code;  c0122945
   d:   74 19                     je     28 <_EIP+0x28> c0122960
Code;  c0122947
   f:   89 f0                     mov    %esi,%eax
Code;  c0122949
  11:   5b                        pop    %ebx
Code;  c012294a
  12:   b9 01 00 00 00            mov    $0x1,%ecx

<0>Kernel panic: Aiee, killing interrupt handler!
--- snip ---

The problem is reported in the unlock_page function in filemap.c, by
this bit of code:

        if (!test_and_clear_bit(PG_locked, &(page)->flags))
                BUG();

Module-wise, the only options selected are for LAN drivers (not
including the one actually being used) and a SYM710 SCSI controller (for
an external CD-ROM drive that is powered off).

The following info comes from the procfs output of a stable 2.4.6 kernel
(since 2.4.14 doesn't want to stay running long enough for me to get
this info from there):

CPU Info:

--- snip ---
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 4
model           : 14
model name      : Am5x86-WT
stepping        : 4
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu
bogomips        : 66.35
--- snip ---

No SCSI drivers loaded, but a Compaq SMART controller is in use:

--- snip ---
ida0: Compaq SMART Controller
       Board ID: 0x2040110e
       Firmware Revision: 2.26
       Controller Sig: 0x2ae7909f
       Memory Address: 0x00000000
       I/O Port: 0x5000
       IRQ: 14
       Logical drives: 1
       Physical drives: 2

       Current Q depth: 0
       Max Q depth since init: 42

Logical Drive Info:
ida/c0d0: blksz=512 nr_blks=8217120
nr_allocs = 3961
nr_frees = 3961
--- snip ---

The system is an EISA system, a Compaq Prosignia VS with 80 MB of RAM, an
AMD K5 processor, an integrated AMD network adapter, and a Compaq SMART
controller.

The only other patch applied to this kernel is the one suggested on this
list for the loopback device (in loop.c, removing the
deactivate_page(page) calls that prevent the kernel from compiling
properly).

I have a second system (Celeron 300A processor, 640 MB of RAM, and IDE
drives) that runs 2.4.14 with these same patches with no problems that
I've seen.

Please let me know if there's any other information needed to diagnose
this problem - this is the first time I've ever had a kernel crash, and
I hope I've included all the relevant information here.
I don't subscribe to the linux-kernel list, so all correspondence should
be copied to me directly as well if possible.

Thanks,

Jim