Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1750721AbVKQKPK (ORCPT ); Thu, 17 Nov 2005 05:15:10 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S1750724AbVKQKPK (ORCPT ); Thu, 17 Nov 2005 05:15:10 -0500
Received: from ookhoi.xs4all.nl ([213.84.114.66]:40668 "EHLO favonius.humilis.net")
    by vger.kernel.org with ESMTP id S1750721AbVKQKPI (ORCPT );
    Thu, 17 Nov 2005 05:15:08 -0500
Date: Thu, 17 Nov 2005 11:15:11 +0100
From: Sander
To: Sander
Cc: Neil Brown , Andrew Morton , linux-kernel@vger.kernel.org,
    reiserfs-dev@namesys.com
Subject: Re: segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch)
Message-ID: <20051117101511.GB2883@favonius>
Reply-To: sander@humilis.net
References: <431B9558.1070900@baanhofman.nl>
    <17179.40731.907114.194935@cse.unsw.edu.au>
    <20051116133639.GA18274@favonius>
    <20051116142000.5c63449f.akpm@osdl.org>
    <17275.48113.533555.948181@cse.unsw.edu.au>
    <20051117075041.GA5563@favonius>
    <20051117101251.GA2883@favonius>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20051117101251.GA2883@favonius>
X-Uptime: 10:11:06 up 2 days, 22:47, 20 users, load average: 1.07, 1.67, 1.78
User-Agent: Mutt/1.5.11
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4483
Lines: 98

Sander wrote (ao):
# Sander wrote (ao):
# # Neil Brown wrote (ao):
# # > On Wednesday November 16, akpm@osdl.org wrote:
# # > > Sander wrote:
# # > > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
# # > > > try this:
# # > >
# # > > It oopsed in reiser4. reiserfs-dev added to Cc...
# # > >
# # >
# # > Hmm... It appears that md/bitmap is calling prepare_write and
# # > commit_write with 'file' as NULL - this works for some filesystems,
# # > but not for reiser4.
# # >
# # > Does this patch help.
# #
# # Something changed, but it didn't fix it it seems:
# #
# # # mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# # mdadm: RUN_ARRAY failed: No such file or directory
#
# FWIW, the following happens when I point --bitmap to /tmp/raid1.bitmap
# which is tmpfs, and also happens when I attach both loop0 and loop1 to
# files on tmpfs.
#
# This would suggest that reiser4 is not solely at fault?
#
# The difference btw is that I can reboot with 'shutdown -r now'
# instead of sysrq. And that mdadm hangs:
#
# # mdadm -C /dev/md1 --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: RUN_ARRAY failed: No such file or directory
#
# # mdadm -C /dev/md1 -f --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: /dev/loop0 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# mdadm: /dev/loop1 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# Continue creating array? yes
# [hang, no prompt, no reaction to ctrl-c, etc]

And even more info. It seems mdadm spins:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  749 root      25   0  1696  568  492 R 99.9  0.1   8:32.50 mdadm

Would sysrq-t be useful?

# [42949549.780000] md: bind
# [42949549.780000] md: bind
# [42949549.780000] md: md1: raid array is not clean -- starting background reconstruction
# [42949549.790000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
# [42949549.790000] md1: bitmap file is out of date, doing full recovery
# [42949549.790000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 524288
# [42949549.790000] Bad page state at free_hot_cold_page (in process 'mdadm', page c10dcc20)
# [42949549.790000] flags:0x80000019 mapping:f5155c84 mapcount:0 count:0
# [42949549.790000] Backtrace:
# [42949549.790000] [] bad_page+0x70/0xb0
# [42949549.790000] [] free_hot_cold_page+0x51/0xd0
# [42949549.790000] [] bitmap_file_put+0x30/0x70
# [42949549.790000] [] bitmap_free+0x1e/0xb0
# [42949549.790000] [] bitmap_create+0xd6/0x2a0
# [42949549.790000] [] do_md_run+0x2ba/0x500
# [42949549.790000] [] add_new_disk+0x157/0x3b0
# [42949549.790000] [] mpage_writepages+0x124/0x3d0
# [42949549.790000] [] __pagevec_free+0x3e/0x60
# [42949549.790000] [] release_pages+0x29/0x160
# [42949549.790000] [] md_ioctl+0x5a1/0x630
# [42949549.790000] [] find_get_pages+0x18/0x40
# [42949549.790000] [] md_ioctl+0x0/0x630
# [42949549.790000] [] blkdev_driver_ioctl+0x54/0x60
# [42949549.790000] [] blkdev_ioctl+0x134/0x180
# [42949549.790000] [] block_ioctl+0x18/0x20
# [42949549.790000] [] block_ioctl+0x0/0x20
# [42949549.790000] [] do_ioctl+0x1f/0x70
# [42949549.790000] [] vfs_ioctl+0x5c/0x1e0
# [42949549.790000] [] __fput+0xe1/0x140
# [42949549.790000] [] sys_ioctl+0x3d/0x70
# [42949549.790000] [] syscall_call+0x7/0xb
# [42949549.790000] Trying to fix it up, but a reboot is needed
# [42949549.790000] md1: failed to create bitmap (524288)
# [42949549.790000] md: pers->run() failed ...
# [42949549.790000] md: md1 stopped.
# [42949549.790000] md: unbind
# [42949549.790000] md: export_rdev(loop1)
# [42949549.790000] md: unbind
# [42949549.790000] md: export_rdev(loop0)

-- 
Humilis IT Services and Solutions
http://www.humilis.net