From: Andreas Dilger
Subject: Re: [PATCH -v2] ext4: add max_dir_size_kb mount option
Date: Fri, 10 Aug 2012 21:22:39 -0600
Message-ID:
References: <20120810215811.GA1137@thunk.org> <1344649235-9289-1-git-send-email-tytso@mit.edu>
Mime-Version: 1.0 (1.0)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8BIT
Cc: Ext4 Developers List, Theodore Ts'o
To: Theodore Ts'o
Return-path:
Received: from mail-pb0-f46.google.com ([209.85.160.46]:61536 "EHLO
	mail-pb0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751060Ab2HKDWk convert rfc822-to-8bit (ORCPT );
	Fri, 10 Aug 2012 23:22:40 -0400
Received: by pbbrr13 with SMTP id rr13so3747001pbb.19 for ;
	Fri, 10 Aug 2012 20:22:39 -0700 (PDT)
In-Reply-To: <1344649235-9289-1-git-send-email-tytso@mit.edu>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On 2012-08-10, at 19:40, Theodore Ts'o wrote:
> Very large directories can cause significant performance problems, or
> perhaps even invoke the OOM killer, if the process is running in a
> highly constrained memory environment (whether it is VM's with a small
> amount of memory or in a small memory cgroup).
>
> So it is useful, in cloud server/data center environments, to be able
> to set a filesystem-wide cap on the maximum size of a directory, to
> ensure that directories never get larger than a sane size. We do this
> via a new mount option, max_dir_size_kb. If there is an attempt to
> grow the directory larger than max_dir_size_kb, the system call will
> return ENOSPC instead.

In our patch, it returns EFBIG, since it isn't really a case of being
out of space for blocks or inodes.
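
To make the suggestion concrete, here is a minimal userspace sketch of
the size check from the ext4_append() hunk quoted below, with EFBIG
swapped in for ENOSPC. It is not the code from either patch:
dir_may_grow() is a hypothetical helper name used only for
illustration, and the kernel structures are replaced by plain integers.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace model of the check the quoted patch adds to ext4_append().
 * dir_may_grow() is a hypothetical name, not a kernel function.
 *
 * i_size:          current directory size in bytes
 * max_dir_size_kb: value of the mount option; 0 means "no limit"
 *
 * Returns 0 if another block may be appended, or -EFBIG once the
 * directory has reached the cap. The quoted patch returns -ENOSPC
 * at this point instead.
 */
static int dir_may_grow(uint64_t i_size, unsigned int max_dir_size_kb)
{
	/* Same comparison as the patch: bytes >> 10 gives whole KiB. */
	if (max_dir_size_kb && (i_size >> 10) >= max_dir_size_kb)
		return -EFBIG;
	return 0;
}
```

With max_dir_size_kb=8, a directory of 8191 bytes may still grow, while
one that is already 8192 bytes gets -EFBIG. Only the errno value
differs from the quoted hunk; the comparison itself is unchanged.
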

Cheers, Andreas

> Google-Bug-Id: 6863013
>
> Signed-off-by: "Theodore Ts'o"
> ---
>  Documentation/filesystems/ext4.txt |    4 ++++
>  fs/ext4/ext4.h                     |    1 +
>  fs/ext4/namei.c                    |    7 +++++++
>  fs/ext4/super.c                    |    7 +++++++
>  4 files changed, 19 insertions(+)
>
> diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
> index 1b7f9ac..43b80b5 100644
> --- a/Documentation/filesystems/ext4.txt
> +++ b/Documentation/filesystems/ext4.txt
> @@ -375,6 +375,10 @@ dioread_nolock		locking. If the dioread_nolock option is specified
>  			Because of the restrictions this options comprises
>  			it is off by default (e.g. dioread_lock).
>
> +max_dir_size_kb=n	This limits the size of directories so that any
> +			attempt to expand them beyond the specified
> +			limit in kilobytes will cause an ENOSPC error.
> +
>  i_version		Enable 64-bit inode version support. This option is
>  			off by default.
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index c3411d4..7c0841e 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1243,6 +1243,7 @@ struct ext4_sb_info {
>  	unsigned int s_mb_order2_reqs;
>  	unsigned int s_mb_group_prealloc;
>  	unsigned int s_max_writeback_mb_bump;
> +	unsigned int s_max_dir_size_kb;
>  	/* where last allocation was done - for stream allocation */
>  	unsigned long s_mb_last_group;
>  	unsigned long s_mb_last_start;
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 2a42cc0..7450ff0 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -55,6 +55,13 @@ static struct buffer_head *ext4_append(handle_t *handle,
>  {
>  	struct buffer_head *bh;
>
> +	if (unlikely(EXT4_SB(inode->i_sb)->s_max_dir_size_kb &&
> +		     ((inode->i_size >> 10) >=
> +		      EXT4_SB(inode->i_sb)->s_max_dir_size_kb))) {
> +		*err = -ENOSPC;
> +		return NULL;
> +	}
> +
>  	*block = inode->i_size >> inode->i_sb->s_blocksize_bits;
>
>  	bh = ext4_bread(handle, inode, *block, 1, err);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 56bcaec..5896dcb 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1230,6 +1230,7 @@ enum {
>  	Opt_inode_readahead_blks, Opt_journal_ioprio,
>  	Opt_dioread_nolock, Opt_dioread_lock,
>  	Opt_discard, Opt_nodiscard, Opt_init_itable, Opt_noinit_itable,
> +	Opt_max_dir_size_kb,
>  };
>
>  static const match_table_t tokens = {
> @@ -1303,6 +1304,7 @@ static const match_table_t tokens = {
>  	{Opt_init_itable, "init_itable=%u"},
>  	{Opt_init_itable, "init_itable"},
>  	{Opt_noinit_itable, "noinit_itable"},
> +	{Opt_max_dir_size_kb, "max_dir_size_kb=%u"},
>  	{Opt_removed, "check=none"},	/* mount option from ext2/3 */
>  	{Opt_removed, "nocheck"},	/* mount option from ext2/3 */
>  	{Opt_removed, "reservation"},	/* mount option from ext2/3 */
> @@ -1483,6 +1485,7 @@ static const struct mount_opts {
>  	{Opt_jqfmt_vfsold, QFMT_VFS_OLD, MOPT_QFMT},
>  	{Opt_jqfmt_vfsv0, QFMT_VFS_V0, MOPT_QFMT},
>  	{Opt_jqfmt_vfsv1, QFMT_VFS_V1, MOPT_QFMT},
> +	{Opt_max_dir_size_kb, 0, MOPT_GTE0},
>  	{Opt_err, 0, 0}
>  };
>
> @@ -1598,6 +1601,8 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
>  		if (!args->from)
>  			arg = EXT4_DEF_LI_WAIT_MULT;
>  		sbi->s_li_wait_mult = arg;
> +	} else if (token == Opt_max_dir_size_kb) {
> +		sbi->s_max_dir_size_kb = arg;
>  	} else if (token == Opt_stripe) {
>  		sbi->s_stripe = arg;
>  	} else if (m->flags & MOPT_DATAJ) {
> @@ -1829,6 +1834,8 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
>  	if (nodefs || (test_opt(sb, INIT_INODE_TABLE) &&
>  	       (sbi->s_li_wait_mult != EXT4_DEF_LI_WAIT_MULT)))
>  		SEQ_OPTS_PRINT("init_itable=%u", sbi->s_li_wait_mult);
> +	if (nodefs || sbi->s_max_dir_size_kb)
> +		SEQ_OPTS_PRINT("max_dir_size_kb=%u", sbi->s_max_dir_size_kb);
>
>  	ext4_show_quota_options(seq, sb);
>  	return 0;
> --
> 1.7.12.rc0.22.gcdd159b
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html