2014-01-25 07:01:17

by Eryu Guan

Subject: [PATCH] ext4: don't remove reserved inodes in ext4_unlink()

A corrupted ext4_dir_entry_2 struct on disk may carry a wrong inode number.
When that number is 8 (EXT4_JOURNAL_INO) and the file is deleted, the
journal inode is gone, and unmounting such a fs could trigger the
following BUG_ON() in start_this_handle():

BUG_ON(journal->j_flags & JBD2_UNMOUNT);

------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:307!
...
CPU: 1 PID: 1535 Comm: umount Not tainted 3.13.0+ #14
...
Call Trace:
[<ffffffff8119f17a>] ? kmem_cache_alloc+0x1ca/0x1f0
[<ffffffff812850f0>] ? jbd2__journal_start+0x90/0x1e0
[<ffffffff81285153>] jbd2__journal_start+0xf3/0x1e0
[<ffffffff81242a62>] ? ext4_evict_inode+0x1b2/0x4f0
[<ffffffff8126d039>] __ext4_journal_start_sb+0x69/0xe0
[<ffffffff81242a62>] ext4_evict_inode+0x1b2/0x4f0
[<ffffffff811d3b8e>] evict+0x9e/0x190
[<ffffffff811d4373>] iput+0xf3/0x180
[<ffffffff8128f301>] jbd2_journal_destroy+0x191/0x220
[<ffffffff810b0ae0>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff8125d004>] ext4_put_super+0x64/0x340
[<ffffffff811bbae2>] generic_shutdown_super+0x72/0xf0
[<ffffffff811bbd77>] kill_block_super+0x27/0x70
[<ffffffff811bc05d>] deactivate_locked_super+0x3d/0x60
[<ffffffff811bc606>] deactivate_super+0x46/0x60
[<ffffffff811d7f47>] mntput_no_expire+0xa7/0x140
[<ffffffff811d939e>] SyS_umount+0x8e/0x100
[<ffffffff81690c29>] system_call_fastpath+0x16/0x1b

Check the inode number in ext4_unlink() and return an error if it is
reserved or nonexistent.

Tested by removing a reserved inode and unmounting the fs. Inodes 1-10
have been tested. Also tested with xfstests.

Cc: "Theodore Ts'o" <[email protected]>
Signed-off-by: Eryu Guan <[email protected]>
---

If this is the right way to go, I'll send another patch for ext3 too.

fs/ext4/namei.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5a0408d..797ce98 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2761,9 +2761,11 @@ end_rmdir:
 static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 {
 	int retval;
+	unsigned long ino;
 	struct inode *inode;
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
+	struct super_block *sb;
 	handle_t *handle = NULL;

 	trace_ext4_unlink_enter(dir, dentry);
@@ -2778,13 +2780,20 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 		goto end_unlink;

 	inode = dentry->d_inode;
+	ino = inode->i_ino;
+	sb = dir->i_sb;

 	retval = -EIO;
-	if (le32_to_cpu(de->inode) != inode->i_ino)
+	if (le32_to_cpu(de->inode) != ino)
 		goto end_unlink;
+	if (ino < EXT4_FIRST_INO(sb) ||
+	    ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)) {
+		ext4_error(sb, "reserved or nonexistent inode %lu", ino);
+		goto end_unlink;
+	}

 	handle = ext4_journal_start(dir, EXT4_HT_DIR,
-				    EXT4_DATA_TRANS_BLOCKS(dir->i_sb));
+				    EXT4_DATA_TRANS_BLOCKS(sb));
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
 		handle = NULL;
--
1.8.5.3



2014-02-12 16:38:31

by Theodore Ts'o

Subject: Re: [PATCH] ext4: don't remove reserved inodes in ext4_unlink()

On Sat, Jan 25, 2014 at 02:58:17PM +0800, Eryu Guan wrote:
> Corrupted ext4_dir_entry_2 struct on disk may have wrong inode number,
> when the inode number is 8 (EXT4_JOURNAL_INO) and the file is deleted,
> the journal inode is gone, and unmounting such a fs could trigger the
> following BUG_ON() in start_this_handle().....

This patch is mostly good, but you need to exempt the root inode.
Otherwise, the following program, which would normally give the error
"unlink: Is a directory", will mark the file system as containing an
error, and so it could allow an unprivileged user to remount the file
system read-only, or force the system to panic and reboot.

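#include <stdio.h>	/* perror() */
#include <unistd.h>	/* unlink() */

int main(int argc, char **argv)
{
	/* Expected to fail; on Linux this prints "unlink: Is a directory". */
	if (unlink("/") < 0)
		perror("unlink");
	return 0;
}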

Cheers,

- Ted
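
The concern above depends on what ext4 does once an error is reported: the "errors=" mount option (continue, remount-ro, or panic) decides whether the filesystem carries on, is remounted read-only, or the machine panics. A minimal userspace sketch of that policy follows. The enum, struct, and function names are hypothetical and only model the behaviour; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the ext4 "errors=" mount option. */
enum sketch_errors_policy {
	SKETCH_ERRORS_CONTINUE,
	SKETCH_ERRORS_REMOUNT_RO,
	SKETCH_ERRORS_PANIC,
};

/* Hypothetical, heavily simplified filesystem state. */
struct sketch_fs {
	enum sketch_errors_policy policy;
	bool has_error;	/* models the "contains errors" state flag */
	bool read_only;
	bool panicked;
};

/* Models the effect of reporting a filesystem error. */
static void sketch_report_error(struct sketch_fs *fs)
{
	assert(fs != NULL);

	/* The error is recorded whatever the policy is. */
	fs->has_error = true;

	switch (fs->policy) {
	case SKETCH_ERRORS_CONTINUE:
		break;
	case SKETCH_ERRORS_REMOUNT_RO:
		fs->read_only = true;
		break;
	case SKETCH_ERRORS_PANIC:
		fs->panicked = true;	/* a real kernel would panic here */
		break;
	}
}
```

In this model, any code path that an unprivileged user can steer into sketch_report_error() lets that user flip the filesystem to read-only or bring the machine down, which is why an error report must only be reachable on genuine corruption.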

2014-02-14 05:04:34

by Eryu Guan

Subject: Re: [PATCH] ext4: don't remove reserved inodes in ext4_unlink()

On Wed, Feb 12, 2014 at 11:38:25AM -0500, Theodore Ts'o wrote:
> On Sat, Jan 25, 2014 at 02:58:17PM +0800, Eryu Guan wrote:
> > Corrupted ext4_dir_entry_2 struct on disk may have wrong inode number,
> > when the inode number is 8 (EXT4_JOURNAL_INO) and the file is deleted,
> > the journal inode is gone, and unmounting such a fs could trigger the
> > following BUG_ON() in start_this_handle().....
>
> This patch is mostly good, but you need to exempt the root inode.
> Otherwise, the following program, which would normally give the error
> "unlink: Is a directory", will mark the file system as containing an
> error, and so it could allow an unprivileged user to remount the file
> system read-only, or force the system to panic and reboot.

Hi Ted,

I don't see how the following program could mark the file system as
containing an error on a patched kernel. I tried running it on both
patched and unpatched kernels, and it gave "unlink: Is a directory" in
both cases. And dumpe2fs -h didn't show that the fs contained errors.

I traced do_unlinkat(), vfs_unlink() and ext4_unlink() with systemtap,
and it showed that only do_unlinkat() was being called.

Am I missing something here?

Thanks,
Eryu
>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
>
> if (unlink("/") < 0)
> perror("unlink");
> return 0;
> }
>
> Cheers,
>
> - Ted

2014-10-12 08:51:21

by Eryu Guan

Subject: [PATCH v2] ext4: don't remove reserved inodes in ext4_unlink()

A corrupted ext4_dir_entry_2 struct on disk may carry a wrong inode number.
When that number is 8 (EXT4_JOURNAL_INO) and the file is deleted, the
journal inode is gone, and unmounting such a fs could trigger the
following BUG_ON() in start_this_handle():

BUG_ON(journal->j_flags & JBD2_UNMOUNT);

------------[ cut here ]------------
kernel BUG at fs/jbd2/transaction.c:307!
...
CPU: 1 PID: 1535 Comm: umount Not tainted 3.13.0+ #14
...
Call Trace:
[<ffffffff8119f17a>] ? kmem_cache_alloc+0x1ca/0x1f0
[<ffffffff812850f0>] ? jbd2__journal_start+0x90/0x1e0
[<ffffffff81285153>] jbd2__journal_start+0xf3/0x1e0
[<ffffffff81242a62>] ? ext4_evict_inode+0x1b2/0x4f0
[<ffffffff8126d039>] __ext4_journal_start_sb+0x69/0xe0
[<ffffffff81242a62>] ext4_evict_inode+0x1b2/0x4f0
[<ffffffff811d3b8e>] evict+0x9e/0x190
[<ffffffff811d4373>] iput+0xf3/0x180
[<ffffffff8128f301>] jbd2_journal_destroy+0x191/0x220
[<ffffffff810b0ae0>] ? abort_exclusive_wait+0xb0/0xb0
[<ffffffff8125d004>] ext4_put_super+0x64/0x340
[<ffffffff811bbae2>] generic_shutdown_super+0x72/0xf0
[<ffffffff811bbd77>] kill_block_super+0x27/0x70
[<ffffffff811bc05d>] deactivate_locked_super+0x3d/0x60
[<ffffffff811bc606>] deactivate_super+0x46/0x60
[<ffffffff811d7f47>] mntput_no_expire+0xa7/0x140
[<ffffffff811d939e>] SyS_umount+0x8e/0x100
[<ffffffff81690c29>] system_call_fastpath+0x16/0x1b

Check the inode number in ext4_unlink() and return an error if it is
reserved or nonexistent (except EXT4_ROOT_INO, since Ted pointed out that
rejecting it would be a security hole).

Tested by removing a reserved inode (modifying the on-disk structure by
hand) and unmounting the fs. Inodes 1-10 have been tested. Also tested
with xfstests.

Signed-off-by: Eryu Guan <[email protected]>
---

(This is v2 of an old patch; I had forgotten about it.)

v2: exempt the root inode as Ted suggested, although unlink("/") would be
caught by the VFS and unlinking a corrupt file with the root inode number
would be caught by ext4_lookup(), so neither case reaches ext4_unlink():

EXT4-fs error (device loop0): ext4_lookup:1441: inode #2: comm rm: 'testfile' linked to parent dir
Aborting journal on device loop0-8.
EXT4-fs (loop0): Remounting filesystem read-only
EXT4-fs error (device loop0): ext4_lookup:1441: inode #2: comm rm: 'testfile' linked to parent dir

fs/ext4/namei.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 603e4eb..6e6b312 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2796,9 +2796,11 @@ end_rmdir:
 static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 {
 	int retval;
+	unsigned long ino;
 	struct inode *inode;
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
+	struct super_block *sb;
 	handle_t *handle = NULL;

 	trace_ext4_unlink_enter(dir, dentry);
@@ -2815,13 +2817,20 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 		goto end_unlink;

 	inode = dentry->d_inode;
+	ino = inode->i_ino;
+	sb = dir->i_sb;

 	retval = -EIO;
-	if (le32_to_cpu(de->inode) != inode->i_ino)
+	if (le32_to_cpu(de->inode) != ino)
 		goto end_unlink;
+	if ((ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) ||
+	    ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)) {
+		ext4_error(sb, "reserved or nonexistent inode %lu", ino);
+		goto end_unlink;
+	}

 	handle = ext4_journal_start(dir, EXT4_HT_DIR,
-				    EXT4_DATA_TRANS_BLOCKS(dir->i_sb));
+				    EXT4_DATA_TRANS_BLOCKS(sb));
 	if (IS_ERR(handle)) {
 		retval = PTR_ERR(handle);
 		handle = NULL;
--
1.8.3.1
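
The condition added by this patch can be read as a small predicate over the inode number. Below is a minimal userspace sketch of it, assuming a first non-reserved inode of 11 (in the kernel, EXT4_FIRST_INO(sb) is derived from the superblock). The function and macro names are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: 2 mirrors EXT4_ROOT_INO; 11 is an assumed value
 * for the first non-reserved inode. */
#define SKETCH_ROOT_INO   2u
#define SKETCH_FIRST_INO 11u

/*
 * Returns true when an inode number taken from a directory entry may be
 * unlinked: it must not be reserved (the root inode excepted) and must
 * not exceed the number of inodes the filesystem has.
 */
static bool sketch_ino_unlinkable(uint32_t ino, uint32_t first_ino,
				  uint32_t inodes_count)
{
	/* The root inode is itself one of the reserved inodes. */
	assert(first_ino > SKETCH_ROOT_INO);

	if (ino < first_ino && ino != SKETCH_ROOT_INO)
		return false;	/* reserved, e.g. 8 for the journal */
	if (ino > inodes_count)
		return false;	/* nonexistent: beyond the inode count */
	return true;
}
```

With a 100-inode filesystem, this sketch accepts inodes 2 and 11 through 100, and rejects 0, 1, 3 through 10, and anything above 100.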


2014-10-13 16:05:03

by Darrick J. Wong

Subject: Re: [PATCH v2] ext4: don't remove reserved inodes in ext4_unlink()

On Sun, Oct 12, 2014 at 04:50:58PM +0800, Eryu Guan wrote:
> Corrupted ext4_dir_entry_2 struct on disk may have wrong inode number,
> when the inode number is 8 (EXT4_JOURNAL_INO) and the file is deleted,
> the journal inode is gone, and unmounting such a fs could trigger the
> following BUG_ON() in start_this_handle().
>
> BUG_ON(journal->j_flags & JBD2_UNMOUNT);
>
> ------------[ cut here ]------------
> kernel BUG at fs/jbd2/transaction.c:307!
> ...
> CPU: 1 PID: 1535 Comm: umount Not tainted 3.13.0+ #14
> ...
> Call Trace:
> [<ffffffff8119f17a>] ? kmem_cache_alloc+0x1ca/0x1f0
> [<ffffffff812850f0>] ? jbd2__journal_start+0x90/0x1e0
> [<ffffffff81285153>] jbd2__journal_start+0xf3/0x1e0
> [<ffffffff81242a62>] ? ext4_evict_inode+0x1b2/0x4f0
> [<ffffffff8126d039>] __ext4_journal_start_sb+0x69/0xe0
> [<ffffffff81242a62>] ext4_evict_inode+0x1b2/0x4f0
> [<ffffffff811d3b8e>] evict+0x9e/0x190
> [<ffffffff811d4373>] iput+0xf3/0x180
> [<ffffffff8128f301>] jbd2_journal_destroy+0x191/0x220
> [<ffffffff810b0ae0>] ? abort_exclusive_wait+0xb0/0xb0
> [<ffffffff8125d004>] ext4_put_super+0x64/0x340
> [<ffffffff811bbae2>] generic_shutdown_super+0x72/0xf0
> [<ffffffff811bbd77>] kill_block_super+0x27/0x70
> [<ffffffff811bc05d>] deactivate_locked_super+0x3d/0x60
> [<ffffffff811bc606>] deactivate_super+0x46/0x60
> [<ffffffff811d7f47>] mntput_no_expire+0xa7/0x140
> [<ffffffff811d939e>] SyS_umount+0x8e/0x100
> [<ffffffff81690c29>] system_call_fastpath+0x16/0x1b
>
> Check inode number in ext4_unlink() and return error if the inode number
> is reserved or nonexistent(except EXT4_ROOT_INO, as Ted pointed out that
> it's a security hole).
>
> Tested by removing a reserved inode(modify the ondisk structure by hand)
> and unmounting the fs. Inodes 1-10 have been tested. Also tested by
> xfstests.
>
> Signed-off-by: Eryu Guan <[email protected]>

Looks reasonable to me, you can add Reviewed-by if you like.

--D

> ---
>
> (This is a v2 of an old patch, I forgot about the patch..)
>
> v2: exempt the root inode as Ted suggested, although unlink("/") would be
> catched by vfs and unlink a corrupt file with root inode number would be
> catched by ext4_lookup, and won't reach ext4_unlink() in both cases
>
> EXT4-fs error (device loop0): ext4_lookup:1441: inode #2: comm rm: 'testfile' linked to parent dir
> Aborting journal on device loop0-8.
> EXT4-fs (loop0): Remounting filesystem read-only
> EXT4-fs error (device loop0): ext4_lookup:1441: inode #2: comm rm: 'testfile' linked to parent dir
>
> fs/ext4/namei.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 603e4eb..6e6b312 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -2796,9 +2796,11 @@ end_rmdir:
> static int ext4_unlink(struct inode *dir, struct dentry *dentry)
> {
> int retval;
> + unsigned long ino;
> struct inode *inode;
> struct buffer_head *bh;
> struct ext4_dir_entry_2 *de;
> + struct super_block *sb;
> handle_t *handle = NULL;
>
> trace_ext4_unlink_enter(dir, dentry);
> @@ -2815,13 +2817,20 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
> goto end_unlink;
>
> inode = dentry->d_inode;
> + ino = inode->i_ino;
> + sb = dir->i_sb;
>
> retval = -EIO;
> - if (le32_to_cpu(de->inode) != inode->i_ino)
> + if (le32_to_cpu(de->inode) != ino)
> goto end_unlink;
> + if ((ino < EXT4_FIRST_INO(sb) && ino != EXT4_ROOT_INO) ||
> + ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)) {
> + ext4_error(sb, "reserved or nonexistent inode %lu", ino);
> + goto end_unlink;
> + }
>
> handle = ext4_journal_start(dir, EXT4_HT_DIR,
> - EXT4_DATA_TRANS_BLOCKS(dir->i_sb));
> + EXT4_DATA_TRANS_BLOCKS(sb));
> if (IS_ERR(handle)) {
> retval = PTR_ERR(handle);
> handle = NULL;
> --
> 1.8.3.1

2014-10-13 16:21:07

by Theodore Ts'o

Subject: Re: [PATCH v2] ext4: don't remove reserved inodes in ext4_unlink()

On Sun, Oct 12, 2014 at 04:50:58PM +0800, Eryu Guan wrote:
> Corrupted ext4_dir_entry_2 struct on disk may have wrong inode number,
> when the inode number is 8 (EXT4_JOURNAL_INO) and the file is deleted,
> the journal inode is gone, and unmounting such a fs could trigger the
> following BUG_ON() in start_this_handle()....
>

I believe the bug that this patch is trying to fix has been addressed
by this commit:

http://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/commit/?h=dev&id=bf8ad98e1bffa5ce178ef5e4ea803a86ac30f9e5

ext4: add ext4_iget_normal() which is to be used for dir tree lookups
If there is a corrupted file system which has directory entries that
point at reserved, metadata inodes, prohibit them from being used by
treating them the same way we treat Boot Loader inodes --- that is,
mark them to be bad inodes. This prohibits them from being opened,
deleted, or modified via chmod, chown, utimes, etc.

In particular, this prevents a corrupted file system which has a
directory entry which points at the journal inode from being deleted
and its blocks released, after which point Much Hilarity Ensues.

Reported-by: Sami Liedes <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected]

- Ted
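
The commit message above describes a check at lookup time instead of in ext4_unlink(): an entry that points at a reserved inode yields a bad inode, and every later operation on it fails. A userspace sketch of that idea follows, under stated assumptions: the struct, the function names, and the value 11 for the first non-reserved inode are hypothetical, and the code does not reproduce the kernel's ext4_iget_normal().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ROOT_INO   2u	/* mirrors EXT4_ROOT_INO */
#define SKETCH_FIRST_INO 11u	/* assumed first non-reserved inode */

/* Hypothetical in-memory inode: its number and a "bad" marker. */
struct sketch_inode {
	uint32_t ino;
	bool bad;
};

/* Directory-tree lookup: reserved inodes (root excepted) come back bad. */
static struct sketch_inode sketch_lookup(uint32_t ino)
{
	struct sketch_inode inode = { ino, false };

	if (ino < SKETCH_FIRST_INO && ino != SKETCH_ROOT_INO)
		inode.bad = true;
	return inode;
}

/* Operations refuse bad inodes, so one lookup-time check covers them all. */
static int sketch_unlink(const struct sketch_inode *inode)
{
	assert(inode != NULL);
	if (inode->bad)
		return -EIO;
	/* ... removing the directory entry would happen here ... */
	return 0;
}

static int sketch_chmod(const struct sketch_inode *inode, unsigned int mode)
{
	assert(inode != NULL);
	(void)mode;	/* the mode is unused in this model */
	if (inode->bad)
		return -EIO;
	/* ... changing the mode would happen here ... */
	return 0;
}
```

Compared with a check placed only in the unlink path, gating at lookup also covers open, chmod, chown and similar operations, which matches the list given in the commit message.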



2014-10-14 03:19:14

by Eryu Guan

Subject: Re: [PATCH v2] ext4: don't remove reserved inodes in ext4_unlink()

On Mon, Oct 13, 2014 at 12:21:00PM -0400, Theodore Ts'o wrote:
> On Sun, Oct 12, 2014 at 04:50:58PM +0800, Eryu Guan wrote:
> > Corrupted ext4_dir_entry_2 struct on disk may have wrong inode number,
> > when the inode number is 8 (EXT4_JOURNAL_INO) and the file is deleted,
> > the journal inode is gone, and unmounting such a fs could trigger the
> > following BUG_ON() in start_this_handle()....
> >
>
> I believe the bug that this patch is trying to fix has been addressed
> by this commit:
>
> http://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/commit/?h=dev&id=bf8ad98e1bffa5ce178ef5e4ea803a86ac30f9e5

Yes, this patch fixes the issue I'm seeing, thanks for pointing it out!

I have one concern though: removing a reserved inode (I tested
EXT4_JOURNAL_INO) on a corrupted ext4 returns EIO as expected, but the fs
is not marked as containing errors (the way other EIOs in ext4_iget()
are handled) and nothing is logged in dmesg. The user may have no idea
what happened, and the corrupted fs keeps being used as normal. I think
EXT4_ERROR_INODE() should also be called somewhere in such a case.

Thanks,
Eryu
>
> ext4: add ext4_iget_normal() which is to be used for dir tree lookups
> If there is a corrupted file system which has directory entries that
> point at reserved, metadata inodes, prohibit them from being used by
> treating them the same way we treat Boot Loader inodes --- that is,
> mark them to be bad inodes. This prohibits them from being opened,
> deleted, or modified via chmod, chown, utimes, etc.
>
> In particular, this prevents a corrupted file system which has a
> directory entry which points at the journal inode from being deleted
> and its blocks released, after which point Much Hilarity Ensues.
>
> Reported-by: Sami Liedes <[email protected]>
> Signed-off-by: Theodore Ts'o <[email protected]>
> Cc: [email protected]
>
> - Ted
>
>