Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754709AbZCBPFZ (ORCPT ); Mon, 2 Mar 2009 10:05:25 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752325AbZCBPFJ (ORCPT ); Mon, 2 Mar 2009 10:05:09 -0500 Received: from an-out-0708.google.com ([209.85.132.249]:43654 "EHLO an-out-0708.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752254AbZCBPFH convert rfc822-to-8bit (ORCPT ); Mon, 2 Mar 2009 10:05:07 -0500 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=subject:from:to:cc:in-reply-to:references:content-type:date :message-id:mime-version:x-mailer:content-transfer-encoding; b=Y03Tdqu4MQFT5yt3L+fMA1UrdCaZN6M6uDVRpWY0OScGOPQzlIkc4u0W7ORDImW19a yIUWibROo0zx/12EQcr6YnqnSD2qWZf+5uHHNUt1ooWTomao5cibiUATDyehaBvTCb9U bxyFAARKXYq+4ryJrWDXHdQ41QllGNLqNIAtA= Subject: Re: XFS kernel BUG at fs/buffer.c:470! with 2.6.28.4 From: Alessandro Bono To: Jan Kara Cc: Dave Chinner , Christoph Hellwig , linux-xfs , linux-kernel In-Reply-To: <20090302133609.GD23880@duck.suse.cz> References: <1234011974.7435.11.camel@champagne.cantina> <20090208222859.GA2532@infradead.org> <1234132752.12370.0.camel@champagne.cantina> <20090208224249.GA11931@infradead.org> <1234133120.12370.7.camel@champagne.cantina> <20090209075308.GA7360@infradead.org> <20090210104304.GP8830@disturbed> <1234432077.9204.15.camel@champagne.cantina> <20090226165838.GA9602@atrey.karlin.mff.cuni.cz> <1235763895.5743.3.camel@champagne> <20090302133609.GD23880@duck.suse.cz> Content-Type: text/plain; charset=US-ASCII Date: Mon, 02 Mar 2009 16:04:59 +0100 Message-Id: <1236006299.5744.7.camel@champagne> Mime-Version: 1.0 X-Mailer: Evolution 2.24.3 Content-Transfer-Encoding: 7BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8454 Lines: 105 On Mon, 2009-03-02 at 14:36 +0100, Jan Kara wrote: > On Fri 27-02-09 20:44:55, Alessandro Bono wrote: > > On Thu, 2009-02-26 at 17:58 +0100, Jan Kara wrote: > > .... > > > > any suggestion? > > > Hmm, are you still able to reproduce the problem? As I'm looking into > > > registers in your dump, no register really seems to contain sensible page > > > flags so it could be some corruption of page pointer. If you are still > > > able to reproduce, could you please do so with the attached patch > > > applied? It will dump us much more information... Thanks. > > > > > > Honza > > > > > > > this time with plain xfs, no lvm no dm_crypt > > just to add some info > > xfs fs created with this command > > > > mkfs.xfs -f -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=4 > > -L store /dev/sdb1 > OK, thanks for the data. I've looked at both your traces and I think the > most likely cause is a bit-flip in memory. Look: The page address (we got > from bh->b_page) is ffffe20000885604. IMHO it should be ffffe20000885600 > because if you look e.g. at mapping, it is 000ac69affff8800, which is a > bogus value but it would look like a valid pointer if we shift it by 32 > bits to the left. The same with private pointer. Index also looks absurdly > high but low 32-bits are actually 0 and previous 4 bytes in the page > structure (we have them shown in the upper 4 bytes of mapping pointer) > contain 0x000ac69a => so given together page index would be 706202 - > perfectly sensible value. > And in the second report it is the same. Also buffer head looks perfectly > valid in both cases. > Such bit flips are usually caused by faulty memory or other HW (io > controler etc.) so I suggest trying to shuffle the hardware somehow - > change memory DIMMs as a starter, running memtest if you don't have a spare > DIMMs but it is not an exception that even though memtest runs just fine > for a long time, memory is really at fault. Ok, I'll change DIMMs, retest and report back Thanks a lot for your support > > Honza > > > Feb 27 19:20:28 champagne kernel: [ 2664.993272] Buffer ffff8800371e7e70 of page ffffe20000885604 not private! Some data to debug: > > Feb 27 19:20:28 champagne kernel: [ 2664.993282] flags: 240000000, mapping: 000ac69affff8800, index: 54276223274057728, private: 2489e50ffff8800 > > Feb 27 19:20:28 champagne kernel: [ 2664.993288] Buffer: state=125, block=23795642, b_size=4096, b_this_page=ffff8800371e7e70 > > Feb 27 19:20:28 champagne kernel: [ 2664.993293] Other buffers in the page: > > Feb 27 19:20:28 champagne kernel: [ 2664.993355] ------------[ cut here ]------------ > > Feb 27 19:20:28 champagne kernel: [ 2664.993361] Kernel BUG at ffffffff802b3653 [verbose debug info unavailable] > > Feb 27 19:20:28 champagne kernel: [ 2664.993366] invalid opcode: 0000 [#1] SMP > > Feb 27 19:20:28 champagne kernel: [ 2664.993373] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq > > Feb 27 19:20:28 champagne kernel: [ 2664.993378] CPU 1 > > Feb 27 19:20:28 champagne kernel: [ 2664.993382] Modules linked in: xfs nls_iso8859_1 nls_cp437 vfat fat nls_base usb_storage libusual af_packet binfmt_misc rfcomm bridge stp llc bnep sco l2cap kvm_intel kvm ppdev ipv6 acpi_cpufreq cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_ondemand freq_table cpufreq_conservative pci_slot sbs sbshc iptable_filter ip_tables x_tables dm_crypt sbp2 lp snd_hda_intel snd_hwdep snd_pcm_oss arc4 snd_pcm snd_page_alloc ecb snd_mixer_oss snd_seq_dummy snd_seq_oss iwlagn iwlcore usbhid snd_seq_midi rfkill btusb snd_rawmidi snd_seq_midi_event snd_seq hid parport_pc snd_timer mac80211 pcmcia bluetooth parport snd_seq_device lis3lv02d leds_hp_disk sdhci_pci sdhci snd video output ricoh_mmc pcspkr tpm_infineon tpm tpm_bios cfg80211 mmc_core led_class yenta_socket rsrc_nonstatic pcmcia_core psmouse container wmi battery ac button soundcore serio_raw iTCO_wdt iTCO_vendor_support evdev dm_multipath ext3 jbd mbcache sr_mod cdrom sg sd_mod crc_t10dif ohci1394 ata_piix ahci ieee1394 l > > Feb 27 19:20:28 champagne kernel: bata scsi_mod ehci_hcd uhci_hcd usbcore e1000e dm_mirror dm_region_hash dm_log dm_snapshot dm_mod thermal processor fan thermal_sys hwmon fuse > > Feb 27 19:20:28 champagne kernel: [ 2664.993553] Pid: 8367, comm: xfsdatad/1 Not tainted 2.6.28.7 #1 > > Feb 27 19:20:28 champagne kernel: [ 2664.993558] RIP: 0010:[] [] end_buffer_async_write+0x119/0x18d > > Feb 27 19:20:28 champagne kernel: [ 2664.993573] RSP: 0018:ffff880084555e40 EFLAGS: 00010246 > > Feb 27 19:20:28 champagne kernel: [ 2664.993578] RAX: 0000000240000000 RBX: ffff8800371e7e70 RCX: ffffffff80554d00 > > Feb 27 19:20:28 champagne kernel: [ 2664.993584] RDX: ffff8800a7ae3000 RSI: 0000000000000046 RDI: ffffffff80572f50 > > Feb 27 19:20:28 champagne kernel: [ 2664.993589] RBP: ffff8800371e7e70 R08: 0000000000000000 R09: 0000000000000000 > > Feb 27 19:20:28 champagne kernel: [ 2664.993594] R10: 000000000000000a R11: 0000000000018600 R12: ffff8801159fd288 > > Feb 27 19:20:28 champagne kernel: [ 2664.993599] R13: ffffe20000885604 R14: ffff88013b85ff00 R15: 0000000000000001 > > Feb 27 19:20:28 champagne kernel: [ 2664.993605] FS: 0000000000000000(0000) GS:ffff88013b803a00(0000) knlGS:0000000000000000 > > Feb 27 19:20:28 champagne kernel: [ 2664.993611] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > > Feb 27 19:20:28 champagne kernel: [ 2664.993616] CR2: 00007f9deefb8ca8 CR3: 0000000117829000 CR4: 00000000000026e0 > > Feb 27 19:20:28 champagne kernel: [ 2664.993621] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > Feb 27 19:20:28 champagne kernel: [ 2664.993626] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > Feb 27 19:20:28 champagne kernel: [ 2664.993632] Process xfsdatad/1 (pid: 8367, threadinfo ffff880084554000, task ffff88008c1222b0) > > Feb 27 19:20:28 champagne kernel: [ 2664.993636] Stack: > > Feb 27 19:20:28 champagne kernel: [ 2664.993640] 0000000000000282 0000000000000004 ffff88000248b6c0 ffff8801159fd288 > > Feb 27 19:20:28 champagne kernel: [ 2664.993648] ffff88013b85fee0 0000000000000000 ffff88010dc356c0 ffff8801159fd288 > > Feb 27 19:20:28 champagne kernel: [ 2664.993657] ffff88013b85fee0 ffffffffa055a9f9 ffffffffa055ab6b ffff8801159fd280 > > Feb 27 19:20:28 champagne kernel: [ 2664.993667] Call Trace: > > Feb 27 19:20:28 champagne kernel: [ 2664.993671] [] ? xfs_destroy_ioend+0x23/0x71 [xfs] > > Feb 27 19:20:28 champagne kernel: [ 2664.993723] [] ? xfs_end_bio_delalloc+0x0/0x19 [xfs] > > Feb 27 19:20:28 champagne kernel: [ 2664.993770] [] ? xfs_end_bio_delalloc+0x0/0x19 [xfs] > > Feb 27 19:20:28 champagne kernel: [ 2664.993815] [] ? run_workqueue+0x79/0xfe > > Feb 27 19:20:28 champagne kernel: [ 2664.993826] [] ? worker_thread+0xd8/0xe7 > > Feb 27 19:20:28 champagne kernel: [ 2664.993834] [] ? autoremove_wake_function+0x0/0x2e > > Feb 27 19:20:28 champagne kernel: [ 2664.993842] [] ? worker_thread+0x0/0xe7 > > Feb 27 19:20:28 champagne kernel: [ 2664.993849] [] ? kthread+0x47/0x73 > > Feb 27 19:20:28 champagne kernel: [ 2664.993856] [] ? schedule_tail+0x27/0x5f > > Feb 27 19:20:28 champagne kernel: [ 2664.993864] [] ? child_rip+0xa/0x11 > > Feb 27 19:20:28 champagne kernel: [ 2664.993872] [] ? kthread+0x0/0x73 > > Feb 27 19:20:28 champagne kernel: [ 2664.993879] [] ? child_rip+0x0/0x11 > > Feb 27 19:20:28 champagne kernel: [ 2664.993885] Code: 10 4c 8b 43 20 48 8b 13 48 89 de 48 c7 c7 f8 ef 46 80 31 c0 e8 b2 db 13 00 48 8b 5b 08 48 39 eb 75 d7 49 8b 45 00 f6 c4 08 75 04 <0f> 0b eb fe 49 8b 5d 10 9c 41 5c fa eb 07 f3 90 f603 10 75 f9 > > Feb 27 19:20:28 champagne kernel: [ 2664.993950] RIP [] end_buffer_async_write+0x119/0x18d > > Feb 27 19:20:28 champagne kernel: [ 2664.993959] RSP > > Feb 27 19:20:28 champagne kernel: [ 2664.993985] ---[ end trace 2d34f97811caf921 ]--- > > > -- --- Cordiali Saluti Alessandro Bono -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/