2010-09-17 05:50:32

by Toshiyuki Okajima

Subject: [RESEND][PATCH] ext4: create own llseek function to handle 2 maxbytes

From: Toshiyuki Okajima <[email protected]>

If a file does not have the EXT4_EXTENTS_FL flag set, the maximum size that
can be written (write system call) differs from the maximum offset that can
be sought (lseek system call).

For example, the following two cases show the difference:
#1: mkfs.ext3 <dev>; mount -t ext4 <dev>
#2: mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>

Table: maximum file size (in bytes) that can be written or sought,
for each combination of filesystem features and file flag
+============+===============================+===============================+
| \ File flag|                               |                               |
|      \     |       !EXT4_EXTENTS_FL        |        EXT4_EXTENTS_FL        |
|case       \|                               |                               |
+------------+-------------------------------+-------------------------------+
| #1         | write: 2194719883264          | write: --------------         |
|            | seek:  2199023251456          | seek:  --------------         |
+------------+-------------------------------+-------------------------------+
| #2         | write: 4402345721856          | write: 17592186044415         |
|            | seek:  17592186044415         | seek:  17592186044415         |
+------------+-------------------------------+-------------------------------+

The difference exists because ext4 has two maxbytes values: sb->s_maxbytes
(the extent-mapped limit) and EXT4_SB(sb)->s_bitmap_maxbytes (the
block-mapped limit). However, generic_file_llseek(), which
ext4_file_operations currently uses for llseek, checks only sb->s_maxbytes.
Therefore, create an ext4-specific llseek function that honors both limits.

The new function is derived from generic_file_llseek(). If the file does
not have the EXT4_EXTENTS_FL flag set, it checks the offset against
EXT4_SB(inode->i_sb)->s_bitmap_maxbytes instead of inode->i_sb->s_maxbytes.

Signed-off-by: Toshiyuki Okajima <[email protected]>
Cc: Andreas Dilger <[email protected]>
---
fs/ext4/dir.c | 2 +-
fs/ext4/ext4.h | 1 +
fs/ext4/file.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 374510f..ece76fb 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -39,7 +39,7 @@ static int ext4_release_dir(struct inode *inode,
struct file *filp);

 const struct file_operations ext4_dir_operations = {
-	.llseek		= generic_file_llseek,
+	.llseek		= ext4_llseek,
 	.read		= generic_read_dir,
 	.readdir	= ext4_readdir,		/* we take BKL. needed?*/
 	.unlocked_ioctl = ext4_ioctl,
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 889ec9d..603f8d2 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1960,6 +1960,7 @@ extern const struct file_operations ext4_dir_operations;
/* file.c */
extern const struct inode_operations ext4_file_inode_operations;
extern const struct file_operations ext4_file_operations;
+extern loff_t ext4_llseek(struct file *file, loff_t offset, int origin);

/* namei.c */
extern const struct inode_operations ext4_dir_inode_operations;
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index ee92b66..4efa6ba 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -130,8 +130,50 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
 	return dquot_file_open(inode, filp);
 }

+/*
+ * ext4_llseek() copied from generic_file_llseek() to handle both
+ * block-mapped and extent-mapped maxbytes values. This should otherwise be
+ * identical with generic_file_llseek().
+ */
+loff_t ext4_llseek(struct file *file, loff_t offset, int origin)
+{
+	struct inode *inode = file->f_mapping->host;
+	loff_t maxbytes;
+
+	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
+		maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
+	else
+		maxbytes = inode->i_sb->s_maxbytes;
+	mutex_lock(&inode->i_mutex);
+	switch (origin) {
+	case SEEK_END:
+		offset += inode->i_size;
+		break;
+	case SEEK_CUR:
+		if (offset == 0) {
+			mutex_unlock(&inode->i_mutex);
+			return file->f_pos;
+		}
+		offset += file->f_pos;
+		break;
+	}
+
+	if (offset < 0 || offset > maxbytes) {
+		mutex_unlock(&inode->i_mutex);
+		return -EINVAL;
+	}
+
+	if (offset != file->f_pos) {
+		file->f_pos = offset;
+		file->f_version = 0;
+	}
+	mutex_unlock(&inode->i_mutex);
+
+	return offset;
+}
+
 const struct file_operations ext4_file_operations = {
-	.llseek		= generic_file_llseek,
+	.llseek		= ext4_llseek,
 	.read		= do_sync_read,
 	.write		= do_sync_write,
 	.aio_read	= generic_file_aio_read,
--
1.5.5.6
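
For readers wondering where the numbers in the table come from, here is a
minimal user-space sketch. It is not the kernel code. It assumes 4 KiB
blocks (bits = 12), 12 direct block pointers, 4-byte block pointers in
indirect blocks, an i_blocks count limited to 2^32 - 1 512-byte sectors
without huge_file, and a limit of 2^48 - 1 filesystem blocks with it. The
names sketch_bitmap_maxbytes() and sketch_extent_maxbytes() are
hypothetical, and the VFS-level MAX_LFS_FILESIZE clamp is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size limit of a block-mapped (indirect) file. */
int64_t sketch_bitmap_maxbytes(int bits, int has_huge_file)
{
	int64_t ptrs, upper_limit, meta_blocks, res;

	/* Assumption: only 1 KiB..4 KiB blocks, so no shift can overflow. */
	assert(bits >= 10 && bits <= 12);
	ptrs = INT64_C(1) << (bits - 2);	/* block pointers per block */

	if (!has_huge_file)
		/* i_blocks counts 512-byte sectors in 32 bits */
		upper_limit = ((INT64_C(1) << 32) - 1) >> (bits - 9);
	else
		/* assumed: i_blocks counts filesystem blocks in 48 bits */
		upper_limit = (INT64_C(1) << 48) - 1;

	/* The indirect blocks themselves also count against i_blocks. */
	meta_blocks = 1;				/* single indirect */
	meta_blocks += 1 + ptrs;			/* double indirect */
	meta_blocks += 1 + ptrs + ptrs * ptrs;		/* triple indirect */
	upper_limit = (upper_limit - meta_blocks) << bits;

	/* Blocks addressable through the direct and indirect pointers. */
	res = 12 + ptrs + ptrs * ptrs + ptrs * ptrs * ptrs;
	res <<= bits;

	return res < upper_limit ? res : upper_limit;
}

/* Hypothetical helper: size limit of an extent-mapped file. */
int64_t sketch_extent_maxbytes(int bits, int has_huge_file)
{
	/* Logical block numbers in an extent are 32 bits wide. */
	int64_t res = (INT64_C(1) << (32 + bits)) - 1;

	assert(bits >= 10 && bits <= 12);
	if (!has_huge_file) {
		int64_t upper_limit = ((INT64_C(1) << 32) - 1) >> (bits - 9);

		upper_limit <<= bits;
		if (res > upper_limit)
			res = upper_limit;
	}
	return res;
}
```

With bits = 12, sketch_bitmap_maxbytes() gives the two "write" values in
the !EXT4_EXTENTS_FL column (huge_file off for case #1, on for case #2),
and sketch_extent_maxbytes() gives the corresponding "seek" values.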


2010-10-09 20:38:17

by Theodore Ts'o

Subject: Re: [RESEND][PATCH] ext4: create own llseek function to handle 2 maxbytes

On Fri, Sep 17, 2010 at 02:32:25PM +0900, Toshiyuki Okajima wrote:
> From: Toshiyuki Okajima <[email protected]>
>
> If the file has no "EXT4_EXTENTS_FL" flag, the maximum size which can be
> written (write systemcall) is different from the maximum size which can be
> sought (lseek systemcall).

Thanks, applied. My apologies for the delay in getting back to you.

BTW, directories are not allowed to grow beyond 2**32 bytes; it's
harmless to allow llseek to offsets beyond that, though. One thing
I'm not sure about is what happens if we use extents (which allow for
offsets greater than 2**32) and use a file system with htree's not
enabled. I'm not sure we have the correct checks to make sure the
directory size doesn't grow beyond 2**32 bytes. This is largely
theoretical, though, since performance of the system will be quite
horrible long before we hit that limit, and I think the main problem
will be with NFSv2. Still, if someone has time, this might be a good
thing to sanity check....

I did rewrite the commit description somewhat. Attached see my
rewrite...

- Ted

ext4: improve llseek error handling for overly large seek offsets

From: Toshiyuki Okajima <[email protected]>

The llseek system call should return EINVAL if passed a seek offset at
which a subsequent write would fail. What this maximum offset should be
depends on whether the huge_file file system feature is set, and on
whether the file is extent based.


If a file does not have the EXT4_EXTENTS_FL flag set, the maximum size that
can be written (write system call) differs from the maximum offset that can
be sought (lseek system call).

For example, the following two cases demonstrate the difference
between the maximum size which can be written and the seek offset
allowed by the llseek system call:

#1: mkfs.ext3 <dev>; mount -t ext4 <dev>
#2: mkfs.ext3 <dev>; tune2fs -Oextent,huge_file <dev>; mount -t ext4 <dev>

Table: maximum file size (in bytes) that can be written or sought,
for each combination of filesystem features and file flag
+============+===============================+===============================+
| \ File flag|                               |                               |
|      \     |       !EXT4_EXTENTS_FL        |        EXT4_EXTENTS_FL        |
|case       \|                               |                               |
+------------+-------------------------------+-------------------------------+
| #1         | write: 2194719883264          | write: --------------         |
|            | seek:  2199023251456          | seek:  --------------         |
+------------+-------------------------------+-------------------------------+
| #2         | write: 4402345721856          | write: 17592186044415         |
|            | seek:  17592186044415         | seek:  17592186044415         |
+------------+-------------------------------+-------------------------------+

The difference exists because ext4 has two maxbytes values: sb->s_maxbytes
(the extent-mapped limit) and EXT4_SB(sb)->s_bitmap_maxbytes (the
block-mapped limit). However, generic_file_llseek(), which
ext4_file_operations currently uses for llseek, checks only sb->s_maxbytes.

Therefore, create an ext4-specific llseek function that honors both limits.

The new function is derived from generic_file_llseek(). If the file does
not have the EXT4_EXTENTS_FL flag set, it checks the offset against
EXT4_SB(inode->i_sb)->s_bitmap_maxbytes instead of inode->i_sb->s_maxbytes.

Signed-off-by: Toshiyuki Okajima <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: Andreas Dilger <[email protected]>
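
The limit selection described in the commit message can be modelled in user
space roughly as follows. This is only a sketch: struct sketch_file and
sketch_llseek() are hypothetical stand-ins for the kernel's struct file and
inode state, locking is omitted, and the two limits in the usage note are
simply the case #2 values from the table.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for the per-file state the seek logic consults. */
struct sketch_file {
	int64_t pos;		/* models file->f_pos */
	int64_t size;		/* models inode->i_size */
	int uses_extents;	/* models the EXT4_EXTENTS_FL flag */
	int64_t extent_max;	/* models sb->s_maxbytes */
	int64_t bitmap_max;	/* models s_bitmap_maxbytes */
};

/* Returns the new position, or -EINVAL if the offset is out of range. */
int64_t sketch_llseek(struct sketch_file *f, int64_t offset, int whence)
{
	/* Pick the limit that matches how the file maps its blocks. */
	int64_t maxbytes = f->uses_extents ? f->extent_max : f->bitmap_max;

	switch (whence) {
	case SEEK_END:
		offset += f->size;
		break;
	case SEEK_CUR:
		if (offset == 0)
			return f->pos;
		offset += f->pos;
		break;
	}

	if (offset < 0 || offset > maxbytes)
		return -EINVAL;

	f->pos = offset;
	return offset;
}
```

For a block-mapped file with bitmap_max = 4402345721856, a SEEK_SET to
4402345721857 is rejected with -EINVAL and the position is left unchanged.
The same offset is accepted once uses_extents is set and extent_max is
17592186044415.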