From: Jan Kara Subject: Re: [PATCH 2/3] ext4 : Pass GFP_KERNEL for transaction allocation if caller can handler failures Date: Tue, 25 Jan 2011 17:17:45 +0100 Message-ID: <20110125161745.GC4088@quack.suse.cz> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Jan Kara , ext4 To: Manish Katiyar Return-path: Received: from cantor.suse.de ([195.135.220.2]:46677 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751426Ab1AYQRx (ORCPT ); Tue, 25 Jan 2011 11:17:53 -0500 Content-Disposition: inline In-Reply-To: Sender: linux-ext4-owner@vger.kernel.org List-ID: Hi, On Sat 22-01-11 19:34:55, Manish Katiyar wrote: > Pass GFP_KERNEL for transaction allocation for ext4 routines if caller > can handler failures Some error recovery paths will need cleaning up before you actually start using them - see below: > diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c > index e0270d1..1a4a944 100644 > --- a/fs/ext4/acl.c > +++ b/fs/ext4/acl.c > @@ -351,7 +351,7 @@ ext4_acl_chmod(struct inode *inode) > > retry: > handle = ext4_journal_start(inode, > - EXT4_DATA_TRANS_BLOCKS(inode->i_sb)); > + EXT4_DATA_TRANS_BLOCKS(inode->i_sb), true); > if (IS_ERR(handle)) { > error = PTR_ERR(handle); > ext4_std_error(inode->i_sb, error); We shouldn't call ext4_std_error() (that possibly logs the message in the kernel log and remounts the fs read-only, panics the kernel or so) in case of ENOMEM... > @@ -449,7 +449,8 @@ ext4_xattr_set_acl(struct dentry *dentry, const > char *name, const void *value, > acl = NULL; > > retry: > - handle = ext4_journal_start(inode, EXT4_DATA_TRANS_BLOCKS(inode->i_sb)); > + handle = ext4_journal_start(inode, > + EXT4_DATA_TRANS_BLOCKS(inode->i_sb), true); > if (IS_ERR(handle)) > return PTR_ERR(handle); It's actually not your bug, but the above should be: error = PTR_ERR(handle); goto release_and_out; > error = ext4_set_acl(handle, inode, type, acl); > diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c > index eb9097a..e0e27a3 100644 > --- a/fs/ext4/ialloc.c > +++ b/fs/ext4/ialloc.c > @@ -1257,7 +1257,7 @@ extern int ext4_init_inode_table(struct > super_block *sb, ext4_group_t group, > if (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_ZEROED)) > goto out; > > - handle = ext4_journal_start_sb(sb, 1); > + handle = ext4_journal_start_sb(sb, 1, true); > if (IS_ERR(handle)) { > ret = PTR_ERR(handle); > goto out; Well, this might be disputable. This function is used to lazily initialize inode table. If the initialization fails, thread removes the request for initialization from the queue. But in case of ENOMEM, it might be more suitable to just postpone the initialization work to a more suitable time... > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 9f7f9e4..76c20b8 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c ... > @@ -1398,7 +1398,7 @@ static int _ext4_get_block(struct inode *inode, > sector_t iblock, > if (map.m_len > DIO_MAX_BLOCKS) > map.m_len = DIO_MAX_BLOCKS; > dio_credits = ext4_chunk_trans_blocks(inode, map.m_len); > - handle = ext4_journal_start(inode, dio_credits); > + handle = ext4_journal_start(inode, dio_credits, false); > if (IS_ERR(handle)) { > ret = PTR_ERR(handle); > return ret; Hmm, this would be actually another useful prerequisite cleanup of this series. _ext4_get_block() should need to start a transaction only when called from direct IO path (otherwise transaction should be already started when creating blocks). But this is only implicit so it would be good to create ext4_get_block_directIO() which would start a transaction, use it as a callback of __blockdev_direct_IO(), and remove the code from standard _ext4_get_block() function. Then you can also make ext4_journal_start() possibly fail and still it will be clear you do not introduce any potential problems. > @@ -1621,7 +1621,7 @@ static int ext4_write_begin(struct file *file, > struct address_space *mapping, > to = from + len; > > retry: > - handle = ext4_journal_start(inode, needed_blocks); > + handle = ext4_journal_start(inode, needed_blocks, false); > if (IS_ERR(handle)) { > ret = PTR_ERR(handle); > goto out; Failing this with ENOMEM is OK. Note that grab_cache_page_write_begin() called just below can fail with ENOMEM as well. > @@ -3553,7 +3554,7 @@ static ssize_t ext4_ind_direct_IO(int rw, struct > kiocb *iocb, > > if (final_size > inode->i_size) { > /* Credits for sb + inode write */ > - handle = ext4_journal_start(inode, 2); > + handle = ext4_journal_start(inode, 2, false); > if (IS_ERR(handle)) { > ret = PTR_ERR(handle); > goto out; This can fail without introducing problems. It's standard directIO write path. > @@ -3596,7 +3597,7 @@ retry: > int err; > > /* Credits for sb + inode write */ > - handle = ext4_journal_start(inode, 2); > + handle = ext4_journal_start(inode, 2, false); > if (IS_ERR(handle)) { > /* This is really bad luck. We've written the data > * but cannot extend i_size. Bail out and pretend This one can fail just fine as well. > @@ -5329,8 +5330,10 @@ int ext4_setattr(struct dentry *dentry, struct > iattr *attr) > > /* (user+group)*(old+new) structure, inode write (sb, > * inode block, ? - but truncate inode update has it) */ > - handle = ext4_journal_start(inode, (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+ > - EXT4_MAXQUOTAS_DEL_BLOCKS(inode->i_sb))+3); > + handle = ext4_journal_start(inode, > + (EXT4_MAXQUOTAS_INIT_BLOCKS(inode->i_sb)+ > + EXT4_MAXQUOTAS_DEL_BLOCKS(inode->i_sb))+3, > + true); > if (IS_ERR(handle)) { > error = PTR_ERR(handle); > goto err_out; > @@ -5365,7 +5368,7 @@ int ext4_setattr(struct dentry *dentry, struct > iattr *attr) > (ext4_test_inode_flag(inode, EXT4_INODE_EOFBLOCKS)))) { > handle_t *handle; > > - handle = ext4_journal_start(inode, 3); > + handle = ext4_journal_start(inode, 3, true); > if (IS_ERR(handle)) { > error = PTR_ERR(handle); > goto err_out; The above two sites are fine but note that err_out calls ext4_std_error() which we don't want to happen in case of ENOMEM. > @@ -5822,7 +5825,7 @@ int ext4_change_inode_journal_flag(struct inode > *inode, int val) > > /* Finally we can mark the inode as dirty. */ > > - handle = ext4_journal_start(inode, 1); > + handle = ext4_journal_start(inode, 1, true); > if (IS_ERR(handle)) > return PTR_ERR(handle); This can fail OK, but you should undo inode flag and aops change before returning error (that would be probably better as a separate preparatory patch because it won't be completely trivial - you need to lock the updates again etc. possibly create a helper function for that so that you don't duplicate the code). > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c > index eb3bc2f..3e7977b 100644 > --- a/fs/ext4/ioctl.c > +++ b/fs/ext4/ioctl.c > @@ -101,7 +101,7 @@ long ext4_ioctl(struct file *filp, unsigned int > cmd, unsigned long arg) > } else if (oldflags & EXT4_EOFBLOCKS_FL) > ext4_truncate(inode); > > - handle = ext4_journal_start(inode, 1); > + handle = ext4_journal_start(inode, 1, false); > if (IS_ERR(handle)) { > err = PTR_ERR(handle); > goto flags_out; This can handle failure just fine... > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index cb10a06..7714a15 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -4487,7 +4488,7 @@ static int ext4_acquire_dquot(struct dquot *dquot) > handle_t *handle; > > handle = ext4_journal_start(dquot_to_inode(dquot), > - EXT4_QUOTA_INIT_BLOCKS(dquot->dq_sb)); > + EXT4_QUOTA_INIT_BLOCKS(dquot->dq_sb), true); > if (IS_ERR(handle)) > return PTR_ERR(handle); > ret = dquot_acquire(dquot); > @@ -4503,7 +4504,7 @@ static int ext4_release_dquot(struct dquot *dquot) > handle_t *handle; > > handle = ext4_journal_start(dquot_to_inode(dquot), > - EXT4_QUOTA_DEL_BLOCKS(dquot->dq_sb)); > + EXT4_QUOTA_DEL_BLOCKS(dquot->dq_sb), true); > if (IS_ERR(handle)) { > /* Release dquot anyway to avoid endless cycle in dqput() */ > dquot_release(dquot); For now put 'false' in these quota functions. Because failure here results in a failure of dquot_initialize() which is not tested in most places and thus results in quota miscomputations... Properly handling this would require another set of cleanups. Honza -- Jan Kara SUSE Labs, CR