To: akpm@linux-foundation.org
CC: linux-kernel@vger.kernel.org, linux-mm@kvack.org
In-reply-to: <20070218125307.4103c04a.akpm@linux-foundation.org> (message from Andrew Morton on Sun, 18 Feb 2007 12:53:07 -0800)
Subject: Re: dirty balancing deadlock
From: Miklos Szeredi
Date: Sun, 18 Feb 2007 23:50:14 +0100

> > I was testing the new fuse shared writable mmap support, and finding
> > that bash-shared-mapping deadlocks (which isn't so strange ;).  What
> > is more strange is that this is not an OOM situation at all, with
> > plenty of free and cached pages.
> >
> > A little more investigation shows that a similar deadlock happens
> > reliably with bash-shared-mapping on a loopback mount, even if only
> > half the total memory is used.
> >
> > The cause is slightly different in the two cases:
> >
> >   - loopback mount: allocation by the underlying filesystem is stalled
> >     on throttle_vm_writeout()
> >
> >   - fuse-loop: page dirtying on the underlying filesystem is stalled on
> >     balance_dirty_pages()
> >
> > In both cases the underlying fs is totally innocent, with no
> > dirty/writeback pages, yet it's waiting for the global dirty+writeback
> > to go below the threshold, which obviously won't, until the
> > allocation/dirtying succeeds.
> >
> > I'm not quite sure what the solution is, and asking for thoughts.
>
> But....  these things don't just throttle.  They also perform large amounts
> of writeback, which causes the dirty levels to subside.
>
> From your description it appears that this writeback isn't happening, or
> isn't working.  How come?

 - filesystems A and B
 - write to A will end up as write to B
 - dirty pages in A manage to go over dirty_threshold
 - page writeback is started from A
 - this triggers writeback for a couple of pages in B
 - writeback finishes normally, but dirty+writeback pages are still over
   threshold
 - balance_dirty_pages in B gets stuck, nothing ever moves after this

At least this is my theory for what happens.

Miklos
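
The sequence in the list above can be played through in a small toy model. This is only a sketch of the theory, not kernel code: ToyVM, toy_balance_dirty_pages, simulate, the batch size and the max_rounds cutoff are all made up for illustration, and B is assumed to sit on a real disk so that its own writeback completes at once. The one property taken from the description is that the writer to B is throttled on the global dirty+writeback total, while it can only clean B's own pages.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Global page accounting shared by both filesystems (toy model)."""
    dirty_threshold: int
    dirty: dict = field(default_factory=lambda: {"A": 0, "B": 0})
    writeback: dict = field(default_factory=lambda: {"A": 0, "B": 0})

    def over_threshold(self) -> bool:
        total = sum(self.dirty.values()) + sum(self.writeback.values())
        return total > self.dirty_threshold


def toy_balance_dirty_pages(vm: ToyVM, fs: str, max_rounds: int = 5) -> bool:
    """Return True if the writer to `fs` may proceed, False if it is stuck.

    Each round flushes only `fs`'s own dirty pages, then re-checks the
    GLOBAL total. max_rounds stands in for "waits forever".
    """
    for _ in range(max_rounds):
        if not vm.over_threshold():
            return True
        vm.dirty[fs] = 0  # assumption: writeback on fs completes immediately
    return not vm.over_threshold()


def simulate(dirty_in_a: int, threshold: int, batch: int = 4) -> str:
    """Write to A ends up as write to B; report 'stuck' or 'progress'."""
    vm = ToyVM(dirty_threshold=threshold)
    vm.dirty["A"] = dirty_in_a

    # Page writeback is started from A: a batch moves dirty -> writeback.
    batch = min(batch, vm.dirty["A"])
    vm.dirty["A"] -= batch
    vm.writeback["A"] += batch

    for _ in range(batch):
        # Writing out one page of A dirties one page in B ...
        vm.dirty["B"] += 1
        # ... and the writer to B is throttled on the global counters.
        if not toy_balance_dirty_pages(vm, "B"):
            return "stuck"  # A's pages stay under writeback forever
        # The write to B went through, so A's page finishes writeback.
        vm.writeback["A"] -= 1
    return "progress"


if __name__ == "__main__":
    print(simulate(dirty_in_a=110, threshold=100))
    print(simulate(dirty_in_a=10, threshold=100))
```

In the first call A alone is over the threshold, so flushing B's single page never brings the global total down and the writer to B never gets out of the loop. In the second call the total stays under the threshold and A's writeback completes normally.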