Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id ; Sun, 8 Jul 2001 13:59:45 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id ; Sun, 8 Jul 2001 13:59:25 -0400 Received: from neon-gw.transmeta.com ([209.10.217.66]:52487 "EHLO neon-gw.transmeta.com") by vger.kernel.org with ESMTP id ; Sun, 8 Jul 2001 13:59:20 -0400 Date: Sun, 8 Jul 2001 10:58:20 -0700 (PDT) From: Linus Torvalds To: Rik van Riel cc: Mike Galbraith , Jeff Garzik , Daniel Phillips , Subject: Re: VM in 2.4.7-pre hurts... In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Sun, 8 Jul 2001, Rik van Riel wrote: > > ... Bingo. You hit the infamous __wait_on_buffer / ___wait_on_page > bug. I've seen this for quite a while now on our quad xeon test > machine, with some kernel versions it can be reproduced in minutes, > with others it won't trigger at all. Hmm.. That would explain why the "tar" gets stuck, but why does the whole machine grind to a halt with all other processes being marked runnable? > I hope there is somebody out there who can RELIABLY trigger > this bug, so we have a chance of tracking it down. > > > tar > > Trace; c012f2da <__wait_on_buffer+6a/8c> > > Trace; c01303c9 I wonder if "getblk()" returned a locked not-up-to-date buffer.. That would explain how the buffer stays locked forever - the "ll_rw_block()" will not actually submit any IO on a locked buffer, so there won't be any IO to release it. And it's interesting to see that this happens for a _inode_ block, not a data block - which could easily have been dirty and scheduled for a write-out. So I wonder if there is some race between "write buffer and try to free it" and "getblk()". The locking in "try_to_free_buffers()" is rather anal, so I don't see how this could happen, but.. That still doesn't explain why everybody is busy running. I'd have expected all the processes to end up waiting for the page or buffer, not stuck in a live-lock. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/