Date: Sat, 14 Jun 2003 00:22:05 +0400
From: Oleg Drokin
To: Christian Jaeger
Cc: linux-kernel@vger.kernel.org
Subject: Re: Lockups with loop'ed sparse files on reiserfs?
Message-ID: <20030613202205.GB22032@namesys.com>
References: <20030613155634.GA18478@namesys.com> <20030613155934.GA19307@namesys.com>
User-Agent: Mutt/1.4i
X-Mailing-List: linux-kernel@vger.kernel.org

Hello!

On Fri, Jun 13, 2003 at 08:07:55PM +0200, Christian Jaeger wrote:
> >Any chance to hit say sysrq-T/sysrq-P to find out where CPU spins?
> I've never used those, I'll have to learn about those debugging
> options first. Where should I go to?

Read /usr/src/linux/Documentation/sysrq.txt

> Now the question is what happens if a partition is full.

There was a known problem with reiserfs that it might sometimes deadlock
in an out-of-space situation. This is fixed in 2.4.21.

> In fact I've seen this in kern.log (full log at
> http://pflanze.mine.nu/~chris/scratch/kern.log ):
> Jun 13 11:34:57 pflanze kernel: raid5: md0, not all disks are
> operational -- trying to recover array
> ...
> Jun 13 11:34:57 pflanze kernel: md0: resyncing spare disk [dev 07:07]
> to replace failed disk

This is raid5 stuff resyncing. Probably it is normal if you just set up
the raid5 array.

> What does happen if a raid array fails (i.e. 2 disks fail and there's
> no spare, or 1 spare and 3 disks fail etc.)?
> If it's not an important

Everything that will access this array will break, I presume ;)

> array (i.e. no swap or root filesystem on it), is there a reason for
> the system to go down? Isn't it possible to just mark the mounted
> filesystem as erroneous and return EIO to applications accessing it?

Something like that will happen.

> There's also the case 1, using uml. In this case I'm sure there was
> no problem with space. The sparse filesystem image file I used is
> exactly 500'000'000 bytes, and there's 1675228 k free space on the
> partition where it is put on.

Ok, that's where sysrq-T/sysrq-P traces would be most useful.
And if you'd try with 2.4.21, that would be even better.

Thank you.

Bye,
    Oleg