From: Jan Kara Subject: Re: [PATCH] ext4: Fix delalloc sync hang with journal lock inversion Date: Wed, 11 Jun 2008 14:41:57 +0200 Message-ID: <20080611124157.GB8121@duck.suse.cz> References: <1212154769-16486-4-git-send-email-aneesh.kumar@linux.vnet.ibm.com> <1212154769-16486-5-git-send-email-aneesh.kumar@linux.vnet.ibm.com> <1212154769-16486-6-git-send-email-aneesh.kumar@linux.vnet.ibm.com> <1212154769-16486-7-git-send-email-aneesh.kumar@linux.vnet.ibm.com> <20080602093459.GC30613@duck.suse.cz> <20080602095956.GB9225@skywalker> <20080602102759.GG30613@duck.suse.cz> <20080605135413.GI8942@skywalker> <20080605162209.GG27370@duck.suse.cz> <20080605191909.GD4723@skywalker> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: cmm@us.ibm.com, linux-ext4@vger.kernel.org To: "Aneesh Kumar K.V" Return-path: Received: from styx.suse.cz ([82.119.242.94]:46998 "EHLO mail.suse.cz" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752231AbYFKMl6 (ORCPT ); Wed, 11 Jun 2008 08:41:58 -0400 Content-Disposition: inline In-Reply-To: <20080605191909.GD4723@skywalker> Sender: linux-ext4-owner@vger.kernel.org List-ID: On Fri 06-06-08 00:49:09, Aneesh Kumar K.V wrote: > On Thu, Jun 05, 2008 at 06:22:09PM +0200, Jan Kara wrote: > > I like it. I'm only not sure whether there cannot be two users of > > write_cache_pages() operating on the same mapping at the same time. Because > > then they could alter writeback_index under each other and that would > > probably result in unpleasant behavior. I think there can be two parallel > > calls for example from sync_single_inode() and sync_page_range(). > > In that case we'd need something like writeback_index inside wbc (or > > maybe just alter range_start automatically when range_cont is set?) so that > > parallel callers do no influence each other. > > > > commit e56edfdeea0d336e496962782f08e1224a101cf2 > Author: Aneesh Kumar K.V > Date: Fri Jun 6 00:47:35 2008 +0530 > > mm: Add range_cont mode for writeback. > > Filesystems like ext4 needs to start a new transaction in > the writepages for block allocation. This happens with delayed > allocation and there is limit to how many credits we can request > from the journal layer. So we call write_cache_pages multiple > times with wbc->nr_to_write set to the maximum possible value > limitted by the max journal credits available. > > Add a new mode to writeback that enables us to handle this > behaviour. If mapping->writeback_index is not set we use > wbc->range_start to find the start index and then at the end > of write_cache_pages we store the index in writeback_index. Next > call to write_cache_pages will start writeout from writeback_index. > Also we limit writing to the specified wbc->range_end. I think this changelog is out of date... > Signed-off-by: Aneesh Kumar K.V > > diff --git a/include/linux/writeback.h b/include/linux/writeback.h > index f462439..0d8573e 100644 > --- a/include/linux/writeback.h > +++ b/include/linux/writeback.h > @@ -63,6 +63,7 @@ struct writeback_control { > unsigned for_writepages:1; /* This is a writepages() call */ > unsigned range_cyclic:1; /* range_start is cyclic */ > unsigned more_io:1; /* more io to be dispatched */ > + unsigned range_cont:1; > }; > > /* > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 789b6ad..182233b 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -882,6 +882,9 @@ int write_cache_pages(struct address_space *mapping, > if (wbc->range_cyclic) { > index = mapping->writeback_index; /* Start from prev offset */ > end = -1; > + } else if (wbc->range_cont) { > + index = wbc->range_start >> PAGE_CACHE_SHIFT; > + end = wbc->range_end >> PAGE_CACHE_SHIFT; Hmm, why isn't this in the next else? > } else { > index = wbc->range_start >> PAGE_CACHE_SHIFT; > end = wbc->range_end >> PAGE_CACHE_SHIFT; > @@ -956,6 +959,9 @@ int write_cache_pages(struct address_space *mapping, > } > if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0)) > mapping->writeback_index = index; > + > + if (wbc->range_cont) > + wbc->range_start = index << PAGE_CACHE_SHIFT; > return ret; > } > EXPORT_SYMBOL(write_cache_pages); Honza -- Jan Kara SUSE Labs, CR