From: "Mike Snitzer"
Date: Tue, 22 Jan 2008 00:29:50 -0500
To: linux-raid@vger.kernel.org, NeilBrown
Cc: linux-kernel@vger.kernel.org, "K. Tanaka"
Subject: Re: 2.6.22.16 MD raid1 doesn't mark removed disk faulty, MD thread goes UN
Message-ID: <170fa0d20801212129v1504b7eao4f0965ac1717e424@mail.gmail.com>
In-Reply-To: <170fa0d20801211504y5ebea9adka614817619d7a05f@mail.gmail.com>

cc'ing Tanaka-san given his recent raid1 BUG report:
http://lkml.org/lkml/2008/1/14/515

On Jan 21, 2008 6:04 PM, Mike Snitzer wrote:
> Under 2.6.22.16, I physically pulled a SATA disk (/dev/sdac, connected to
> an aacraid controller) that was acting as the local raid1 member of
> /dev/md30.
>
> Linux MD didn't see a /dev/sdac1 error until I tried forcing the issue by
> doing a read (with dd) from /dev/md30:
>
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Sense Key :
> Hardware Error [current]
> Jan 21 17:08:07 lab17-233 kernel: Info fld=0x0
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Add. Sense:
> Internal target failure
> Jan 21 17:08:07 lab17-233 kernel: end_request: I/O error, dev sdac, sector 71
> Jan 21 17:08:07 lab17-233 kernel: printk: 3 messages suppressed.
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 8
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 16
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 24
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 32
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 40
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 48
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 56
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 64
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 72
> Jan 21 17:08:07 lab17-233 kernel: raid1: sdac1: rescheduling sector 80
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Sense Key :
> Hardware Error [current]
> Jan 21 17:08:07 lab17-233 kernel: Info fld=0x0
> Jan 21 17:08:07 lab17-233 kernel: sd 2:0:27:0: [sdac] Add. Sense:
> Internal target failure
> Jan 21 17:08:07 lab17-233 kernel: end_request: I/O error, dev sdac, sector 343
> Jan 21 17:08:08 lab17-233 kernel: sd 2:0:27:0: [sdac] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
> Jan 21 17:08:08 lab17-233 kernel: sd 2:0:27:0: [sdac] Sense Key :
> Hardware Error [current]
> Jan 21 17:08:08 lab17-233 kernel: Info fld=0x0
> ...
> Jan 21 17:08:12 lab17-233 kernel: sd 2:0:27:0: [sdac] Add. Sense:
> Internal target failure
> Jan 21 17:08:12 lab17-233 kernel: end_request: I/O error, dev sdac, sector 3399
> Jan 21 17:08:12 lab17-233 kernel: printk: 765 messages suppressed.
> Jan 21 17:08:12 lab17-233 kernel: raid1: sdac1: rescheduling sector 3336
>
> However, the MD layer still hasn't marked the sdac1 member faulty:
>
> md30 : active raid1 nbd2[1](W) sdac1[0]
> 4016204 blocks super 1.0 [2/2] [UU]
> bitmap: 1/8 pages [4KB], 256KB chunk
>
> The dd I used to read from /dev/md30 is blocked on IO:
>
> Jan 21 17:13:55 lab17-233 kernel: dd D 00000afa9cf5c346
> 0 12337 7702 (NOTLB)
> Jan 21 17:13:55 lab17-233 kernel: ffff81010c449868 0000000000000082
> 0000000000000000 ffffffff80268f14
> Jan 21 17:13:55 lab17-233 kernel: ffff81015da6f320 ffff81015de532c0
> 0000000000000008 ffff81012d9d7780
> Jan 21 17:13:55 lab17-233 kernel: ffff81015fae2880 0000000000004926
> ffff81012d9d7970 00000001802879a0
> Jan 21 17:13:55 lab17-233 kernel: Call Trace:
> Jan 21 17:13:55 lab17-233 kernel: [] mempool_alloc+0x24/0xda
> Jan 21 17:13:55 lab17-233 kernel: []
> :raid1:wait_barrier+0x84/0xc2
> Jan 21 17:13:55 lab17-233 kernel: []
> default_wake_function+0x0/0xe
> Jan 21 17:13:55 lab17-233 kernel: []
> :raid1:make_request+0x83/0x5c0
> Jan 21 17:13:55 lab17-233 kernel: []
> __make_request+0x57f/0x668
> Jan 21 17:13:55 lab17-233 kernel: []
> generic_make_request+0x26e/0x2a9
> Jan 21 17:13:55 lab17-233 kernel: [] mempool_alloc+0x24/0xda
> Jan 21 17:13:55 lab17-233 kernel: [] __next_cpu+0x19/0x28
> Jan 21 17:13:55 lab17-233 kernel: [] submit_bio+0xb6/0xbd
> Jan 21 17:13:55 lab17-233 kernel: [] submit_bh+0xdf/0xff
> Jan 21 17:13:55 lab17-233 kernel: []
> block_read_full_page+0x271/0x28e
> Jan 21 17:13:55 lab17-233 kernel: []
> blkdev_get_block+0x0/0x46
> Jan 21 17:13:55 lab17-233 kernel: []
> radix_tree_insert+0xcb/0x18c
> Jan 21 17:13:55 lab17-233 kernel: []
> __do_page_cache_readahead+0x16d/0x1df
> Jan 21 17:13:55 lab17-233 kernel: [] getnstimeofday+0x32/0x8d
> Jan 21 17:13:55 lab17-233 kernel: [] ktime_get_ts+0x1a/0x4e
> Jan 21 17:13:55 lab17-233 kernel: [] delayacct_end+0x7d/0x88
> Jan 21 17:13:55 lab17-233 kernel: []
> blockable_page_cache_readahead+0x53/0xb2
> Jan 21 17:13:55 lab17-233 kernel: []
> make_ahead_window+0x82/0x9e
> Jan 21 17:13:55 lab17-233 kernel: []
> page_cache_readahead+0x18a/0x1c1
> Jan 21 17:13:55 lab17-233 kernel: []
> do_generic_mapping_read+0x135/0x3fc
> Jan 21 17:13:55 lab17-233 kernel: []
> file_read_actor+0x0/0x170
> Jan 21 17:13:55 lab17-233 kernel: []
> generic_file_aio_read+0x119/0x155
> Jan 21 17:13:55 lab17-233 kernel: [] do_sync_read+0xc9/0x10c
> Jan 21 17:13:55 lab17-233 kernel: []
> autoremove_wake_function+0x0/0x2e
> Jan 21 17:13:55 lab17-233 kernel: []
> do_mmap_pgoff+0x639/0x7a5
> Jan 21 17:13:55 lab17-233 kernel: [] vfs_read+0xcb/0x153
> Jan 21 17:13:55 lab17-233 kernel: [] sys_read+0x45/0x6e
> Jan 21 17:13:55 lab17-233 kernel: [] tracesys+0xdc/0xe1
> Jan 21 17:13:55 lab17-233 kernel:
>
> The md30 kernel thread is waiting on IO, under live crash I see the
> md30_raid1 thread has:
>
> crash> ps | grep md30_raid1
> 8744 2 1 ffff81013e39d800 UN 0.0 0 0 [md30_raid1]
> crash> bt 8744
> PID: 8744 TASK: ffff81013e39d800 CPU: 1 COMMAND: "md30_raid1"
> #0 [ffff81013e363cc0] schedule at ffffffff80457ddc
> #1 [ffff81013e363db8] raid1d at ffffffff88b93539
> #2 [ffff81013e363ed8] md_thread at ffffffff803d4f9d
> #3 [ffff81013e363f28] kthread at ffffffff80245651
> #4 [ffff81013e363f48] kernel_thread at ffffffff8020aa38
> crash> dis ffffffff88b93520 10
> 0xffffffff88b93520 : add %eax,(%rax)
> 0xffffffff88b93522 : add %al,(%rax)
> 0xffffffff88b93524 : sti
> 0xffffffff88b93525 : mov (%r14),%rax
> 0xffffffff88b93528 : mov 0x260(%rax),%rdi
> 0xffffffff88b9352f : callq 0xffffffff88b912e0
> 0xffffffff88b93534 : callq 0xffffffff80457480
> <__sched_text_start>
> 0xffffffff88b93539 : mov %rbx,%rdi
> 0xffffffff88b9353c : callq 0xffffffff80459796
> <_spin_lock_irq>
> 0xffffffff88b93541 : jmp 0xffffffff88b934f4
>

The raid1d thread is locked at line 720 in raid1.c (raid1d+2437), aka
freeze_array():

(gdb) l *0x0000000000002539
0x2539 is in raid1d (drivers/md/raid1.c:720).
715              * wait until barrier+nr_pending match nr_queued+2
716              */
717             spin_lock_irq(&conf->resync_lock);
718             conf->barrier++;
719             conf->nr_waiting++;
720             wait_event_lock_irq(conf->wait_barrier,
721                                 conf->barrier+conf->nr_pending == conf->nr_queued+2,
722                                 conf->resync_lock,
723                                 raid1_unplug(conf->mddev->queue));
724             spin_unlock_irq(&conf->resync_lock);

Given Tanaka-san's report against 2.6.23 and my hitting what seems to
be the same deadlock in 2.6.22.16, it stands to reason this affects
raid1 in 2.6.24-rcX too.

Mike
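P.S. To make the arithmetic in that wait_event_lock_irq() condition easier
to follow, here is a small user-space toy model I put together. This is
only my own illustration, not kernel code: the field names mirror raid1's
conf_t, but the counter values below are made up for the example.

/*
 * toy_freeze.c - toy model of the freeze_array() wait condition quoted
 * above (drivers/md/raid1.c:720-721).  Illustration only; the values in
 * main() are hypothetical.
 */
#include <stdio.h>

struct toy_conf {
	int barrier;     /* already incremented by freeze_array() */
	int nr_pending;  /* bios issued to member devices, not yet finished */
	int nr_queued;   /* failed bios parked on the retry list for raid1d */
};

/* The condition freeze_array() waits on. */
static int freeze_array_would_finish(const struct toy_conf *c)
{
	return c->barrier + c->nr_pending == c->nr_queued + 2;
}

int main(void)
{
	/* Expected case: apart from the failed request raid1d is currently
	 * handling, every pending bio has completed or been queued for retry. */
	struct toy_conf ok = { .barrier = 1, .nr_pending = 1, .nr_queued = 0 };

	/* Hang case (made-up numbers): bios sent to the pulled disk stay
	 * counted in nr_pending forever and never reach nr_queued, so the
	 * equality can never hold and the wait never finishes. */
	struct toy_conf stuck = { .barrier = 1, .nr_pending = 3, .nr_queued = 0 };

	printf("ok:    %d + %d == %d + 2 ? %s\n", ok.barrier, ok.nr_pending,
	       ok.nr_queued, freeze_array_would_finish(&ok) ? "yes" : "no");
	printf("stuck: %d + %d == %d + 2 ? %s\n", stuck.barrier, stuck.nr_pending,
	       stuck.nr_queued, freeze_array_would_finish(&stuck) ? "yes" : "no");
	return 0;
}

In the "stuck" scenario the equality never becomes true, which is at least
consistent with what the backtrace shows: if bios issued to the pulled disk
neither complete nor get moved onto the retry list, nothing can ever satisfy
the condition, and raid1d sits in freeze_array() in the UN state while new
requests (the dd above) pile up behind wait_barrier().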