Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261534AbVAXRnh (ORCPT );
	Mon, 24 Jan 2005 12:43:37 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261537AbVAXRnh (ORCPT );
	Mon, 24 Jan 2005 12:43:37 -0500
Received: from mx1.redhat.com ([66.187.233.31]:2196 "EHLO mx1.redhat.com")
	by vger.kernel.org with ESMTP id S261534AbVAXRn3 (ORCPT );
	Mon, 24 Jan 2005 12:43:29 -0500
Subject: Re: [Ext2-devel] [PATCH] JBD: fix against journal overflow
From: "Stephen C. Tweedie"
To: Alex Tomas
Cc: Stephen Tweedie, linux-kernel,
	"ext2-devel@lists.sourceforge.net", Andrew Morton
In-Reply-To:
References:
Content-Type: text/plain
Message-Id: <1106588589.2103.116.camel@sisko.sctweedie.blueyonder.co.uk>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.5 (1.4.5-9)
Date: Mon, 24 Jan 2005 17:43:09 +0000
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1794
Lines: 47

Hi,

On Wed, 2005-01-19 at 15:32, Alex Tomas wrote:

> under some quite high load, jbd can hit J_ASSERT(journal->j_free > 1)
> in journal_next_log_block(). The cause is the following:
>
> journal_commit_transaction()
> {
> 	struct buffer_head *wbuf[64];
> 	/* If there's no more to do, or if the descriptor is full,
> 	   let the IO rip! */
> 	if (bufs == ARRAY_SIZE(wbuf) ||
> 	    commit_transaction->t_buffers == NULL ||
> 	    space_left < sizeof(journal_block_tag_t) + 16) {
>
> so, the real limit isn't size of journal descriptor, but wbuf.

I don't see how that "limit" is relevant here.  wbuf is nothing but the
size of the IO batches we pass to ll_rw_block() during that commit
phase.  j_free affects the total size of space the *entire* commit has
to run into, and (as akpm has commented with a big marker beside it)
start_this_handle() reserves a *lot* of headroom for the extra space
that may be needed for transaction metadata.
(The comment there about journal_extend() needing to match it looks
correct, though --- that will need fixing.)

The only case I've ever seen the j_free > 1 assert fail on was the xattr
test that tridge was triggering with AG's first-generation xattr sharing
fix last December, and that was a journal_release_buffer() credits
accounting problem.

So NAK --- the wbuf batch size just doesn't seem to be relevant to the
problem being claimed.  Have you really seen this patch make a
difference in testing?

Cheers,
 Stephen