From: carlos@fisica.ufpr.br (Carlos Carvalho)
Date: Thu, 8 Nov 2007 19:40:18 -0200
To: Jeff Lessem, root@c3sl.ufpr.br
Cc: Dan Williams, BERTRAND Joël, Justin Piszcz, Neil Brown,
    linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state
Message-ID: <18227.33346.994456.270194@fisica.ufpr.br>
In-Reply-To: <47314653.80905@Lessem.org>
References: <18222.16003.92062.970530@notabene.brown>
    <47303FB8.7000801@systella.fr>
    <1194398700.2970.18.camel@dwillia2-linux.ch.intel.com>
    <47314653.80905@Lessem.org>

Jeff Lessem (Jeff@Lessem.org) wrote on 6 November 2007 22:00:
 >Dan Williams wrote:
 > > The following patch, also attached, cleans up cases where the code looks
 > > at sh->ops.pending when it should be looking at the consistent
 > > stack-based snapshot of the operations flags.
 >
 >I tried this patch (against a stock 2.6.23), and it did not work for
 >me. Not only did I/O to the affected RAID5 & XFS partition stop, but
 >also I/O to all other disks. I was not able to capture any debugging
 >information, but I should be able to do that tomorrow when I can hook
 >a serial console to the machine.
 >
 >I'm not sure if my problem is identical to these others, as mine only
 >seems to manifest with RAID5+XFS. The RAID rebuilds with no problem,
 >and I've not had any problems with RAID5+ext3.

Us too! We're stuck trying to build a disk server with several disks
in a raid5 array, and the rsync from the old machine stops writing to
the new filesystem. It only happens under heavy IO. We can make it
lock without rsync, using 8 simultaneous dd's to the array. All IO
stops, including the resync after a newly created raid or after an
unclean reboot.

We could not trigger the problem with ext3 or reiser3; it only happens
with xfs.
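For anyone who wants to generate a comparable load, here is a minimal
sketch of 8 concurrent sequential writers. It is not the exact
procedure used in the report, which ran dd directly; the function
names, the file names (ddtest.N), the block size and the per-writer
size are all assumptions made for illustration, and Python threads
stand in for separate dd processes.

```python
import os
import tempfile
import threading


def write_stream(path, total_bytes, block_size=1 << 20):
    """Write total_bytes of zeros to path in block_size chunks,
    roughly what `dd if=/dev/zero of=path bs=1M` does."""
    block = b"\0" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before returning
    return written


def run_parallel_writers(target_dir, writers=8, total_bytes=1 << 30):
    """Start `writers` concurrent write streams in target_dir and wait
    for all of them. File names are hypothetical (ddtest.0, ddtest.1, ...)."""
    paths = [os.path.join(target_dir, f"ddtest.{i}") for i in range(writers)]
    threads = [
        threading.Thread(target=write_stream, args=(p, total_bytes))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # on an affected kernel this may never return
    return paths
```

To use it, call run_parallel_writers() with a directory on the
filesystem under test (for example a mount point of the raid5 array;
the path is whatever applies on your machine) and watch whether the
writers and any running resync keep making progress.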