2007-06-13 20:56:57

by Rich Coe

[permalink] [raw]
Subject: PATCH: udf fs corruption on linux-2.6

Hi Linus,

This patch fixes directory and missing files corruption in fs/udf which
occurs on all known 2.6 releases.

The corruption occurs because blocks which were pre-alloc'd for a directory
are released back to the fs freelist, but the inode's alloc block information
is not updated to reflect this.

You would not see corruption if the number of files in any directory is
less than 41, because the pre-alloc routine does not allocate blocks for the
directory until the number of files is over 40.

The problem occurs during unmounting because fs/udf incorrectly calls
udf_discard_prealloc() from udf_clear_inode(). udf_discard_prealloc() will
update the inode and schedule it for write, but no write will ever occur
because the fs is in the process of being umount'd.

The solution is to add a put_inode routine to update the inode contents
and release the pre-alloc'd blocks to disk prior to clearing the inode
from the kernel.

Test case:
mkuddfs /dev/scd0
mount -o sync /dev/scd0 /mnt/cdrom
mkdir /mnt/cdrom/A /mnt/cdrom/B
cp A/* /mnt/cdrom/A [ A contains 90 files of various sizes ]
cp B/* /mnt/cdrom/B [ B contains 20 or fewer files ]
umount /mnt/cdrom

Here you can see how 7 blocks starting at sector 139 are free and listed in the
directory entry for 'A'. I used udfdump to get the following information:
[ ... ]
Free space found on this partition
[00000139 - 00000159] [00000161 - 00000191] [00000193 - 00000223]
[00004464 - 00004475] [00004485 - 00524286] [00524287 - 01048573]
[01048574 - 01572860] [01572861 - 02097147] [02097148 - 02235039]
[ ... ]
Filename `A`
[ ... ]
[ blob at sector 2038 for 2048 bytes in logical partion 0 ]
[ blob at sector 137 for 4096 bytes in logical partion 0 ]
[ blob at sector 139 for 14336 bytes in logical partion 0 flags 1 ]

--
Rich Coe [email protected]
Virtual Principle Engineer General Electric Healthcare Technologies
Global Software Platforms, Computer Technology Team

Signed-off-by: Rich Coe <[email protected]>
---
diff -urNp linux-2.6.20.orig/fs/udf/inode.c linux-2.6.20/fs/udf/inode.c
--- linux-2.6.20.orig/fs/udf/inode.c 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/inode.c 2007-06-13 11:32:41.930983471 -0500
@@ -102,14 +102,17 @@ no_delete:

void udf_clear_inode(struct inode *inode)
{
+ kfree(UDF_I_DATA(inode));
+ UDF_I_DATA(inode) = NULL;
+}
+
+void udf_put_inode(struct inode *inode)
+{
if (!(inode->i_sb->s_flags & MS_RDONLY)) {
lock_kernel();
udf_discard_prealloc(inode);
unlock_kernel();
}
-
- kfree(UDF_I_DATA(inode));
- UDF_I_DATA(inode) = NULL;
}

static int udf_writepage(struct page *page, struct writeback_control *wbc)
diff -urNp linux-2.6.20.orig/fs/udf/super.c linux-2.6.20/fs/udf/super.c
--- linux-2.6.20.orig/fs/udf/super.c 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/super.c 2007-06-13 11:31:23.793017185 -0500
@@ -166,6 +166,7 @@ static struct super_operations udf_sb_op
.write_inode = udf_write_inode,
.delete_inode = udf_delete_inode,
.clear_inode = udf_clear_inode,
+ .put_inode = udf_put_inode,
.put_super = udf_put_super,
.write_super = udf_write_super,
.statfs = udf_statfs,
diff -urNp linux-2.6.20.orig/fs/udf/udfdecl.h linux-2.6.20/fs/udf/udfdecl.h
--- linux-2.6.20.orig/fs/udf/udfdecl.h 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/udfdecl.h 2007-06-13 11:31:16.684293910 -0500
@@ -97,6 +97,7 @@ extern void udf_truncate(struct inode *)
extern void udf_read_inode(struct inode *);
extern void udf_delete_inode(struct inode *);
extern void udf_clear_inode(struct inode *);
+extern void udf_put_inode(struct inode *);
extern int udf_write_inode(struct inode *, int);
extern long udf_block_map(struct inode *, long);
extern int8_t inode_bmap(struct inode *, int, kernel_lb_addr *, uint32_t *, kernel_lb_addr *, uint32_t *, uint32_t *, struct buffer_head **);


2007-06-13 21:27:45

by Chuck Ebbert

[permalink] [raw]
Subject: Re: PATCH: udf fs corruption on linux-2.6

On 06/13/2007 04:48 PM, Rich Coe wrote:
> Hi Linus,
>
> This patch fixes directory and missing files corruption in fs/udf which
> occurs on all known 2.6 releases.
>
> The corruption occurs because blocks which were pre-alloc'd for a directory
> are released back to the fs freelist, but the inode's alloc block information
> is not updated to reflect this.
>
> You would not see corruption if the number of files in any directory is
> less than 41, because the pre-alloc routine does not allocate blocks for the
> directory until the number of files is over 40.
>
> The problem occurs during unmounting because fs/udf incorrectly calls
> udf_discard_prealloc() from udf_clear_inode(). udf_discard_prealloc() will
> update the inode and schedule it for write, but no write will ever occur
> because the fs is in the process of being umount'd.
>
> The solution is to add a put_inode routine to update the inode contents
> and release the pre-alloc'd blocks to disk prior to clearing the inode
> from the kernel.

http://lkml.org/lkml/2007/6/11/79

2007-06-14 08:13:07

by Jan Kara

[permalink] [raw]
Subject: Re: PATCH: udf fs corruption on linux-2.6

Hi Rich,

On Wed 13-06-07 15:48:03, Rich Coe wrote:
> This patch fixes directory and missing files corruption in fs/udf which
> occurs on all known 2.6 releases.
>
> The corruption occurs because blocks which were pre-alloc'd for a directory
> are released back to the fs freelist, but the inode's alloc block information
> is not updated to reflect this.
>
> You would not see corruption if the number of files in any directory is
> less than 41, because the pre-alloc routine does not allocate blocks for the
> directory until the number of files is over 40.
>
> The problem occurs during unmounting because fs/udf incorrectly calls
> udf_discard_prealloc() from udf_clear_inode(). udf_discard_prealloc() will
> update the inode and schedule it for write, but no write will ever occur
> because the fs is in the process of being umount'd.
>
> The solution is to add a put_inode routine to update the inode contents
> and release the pre-alloc'd blocks to disk prior to clearing the inode
> from the kernel.
>
> Test case:
> mkuddfs /dev/scd0
> mount -o sync /dev/scd0 /mnt/cdrom
> mkdir /mnt/cdrom/A /mnt/cdrom/B
> cp A/* /mnt/cdrom/A [ A contains 90 files of various sizes ]
> cp B/* /mnt/cdrom/B [ B contains 20 or fewer files ]
> umount /mnt/cdrom
>
> Here you can see how 7 blocks starting at sector 139 are free and listed in the
> directory entry for 'A'. I used udfdump to get the following information:
> [ ... ]
> Free space found on this partition
> [00000139 - 00000159] [00000161 - 00000191] [00000193 - 00000223]
> [00004464 - 00004475] [00004485 - 00524286] [00524287 - 01048573]
> [01048574 - 01572860] [01572861 - 02097147] [02097148 - 02235039]
> [ ... ]
> Filename `A`
> [ ... ]
> [ blob at sector 2038 for 2048 bytes in logical partion 0 ]
> [ blob at sector 137 for 4096 bytes in logical partion 0 ]
> [ blob at sector 139 for 14336 bytes in logical partion 0 flags 1 ]
>
> --
> Rich Coe [email protected]
> Virtual Principle Engineer General Electric Healthcare Technologies
> Global Software Platforms, Computer Technology Team
>
> Signed-off-by: Rich Coe <[email protected]>
> ---
> diff -urNp linux-2.6.20.orig/fs/udf/inode.c linux-2.6.20/fs/udf/inode.c
> --- linux-2.6.20.orig/fs/udf/inode.c 2007-02-04 12:44:54.000000000 -0600
> +++ linux-2.6.20/fs/udf/inode.c 2007-06-13 11:32:41.930983471 -0500
> @@ -102,14 +102,17 @@ no_delete:
>
> void udf_clear_inode(struct inode *inode)
> {
> + kfree(UDF_I_DATA(inode));
> + UDF_I_DATA(inode) = NULL;
> +}
> +
> +void udf_put_inode(struct inode *inode)
> +{
> if (!(inode->i_sb->s_flags & MS_RDONLY)) {
> lock_kernel();
> udf_discard_prealloc(inode);
> unlock_kernel();
> }
Calling udf_discard_preallloc() from put_inode() has the problem that you
truncate the last extent too early - see the comments in the patch Chuck
pointed too (http://lkml.org/lkml/2007/6/11/79).
Can you try whether that patch fixes your problems?

Thanks
Honza
--
Jan Kara <[email protected]>
SuSE CR Labs

2007-06-14 21:32:40

by Rich Coe

[permalink] [raw]
Subject: RFC: Re: PATCH: udf fs corruption on linux-2.6

I've updated the patch below to use drop_inode rather than put_inode.

drop_inode is only called when the last iput() reference to the inode is
released, where put_inode is called for every iput().

Rich

On Wed, 13 Jun 2007 15:48:03 -0500
Rich Coe <[email protected]> wrote:
> Hi Linus,
>
> This patch fixes directory and missing files corruption in fs/udf which
> occurs on all known 2.6 releases.
>
> The corruption occurs because blocks which were pre-alloc'd for a directory
> are released back to the fs freelist, but the inode's alloc block information
> is not updated to reflect this.
>
> You would not see corruption if the number of files in any directory is
> less than 41, because the pre-alloc routine does not allocate blocks for the
> directory until the number of files is over 40.
>
> The problem occurs during unmounting because fs/udf incorrectly calls
> udf_discard_prealloc() from udf_clear_inode(). udf_discard_prealloc() will
> update the inode and schedule it for write, but no write will ever occur
> because the fs is in the process of being umount'd.
>
> The solution is to add a put_inode routine to update the inode contents
> and release the pre-alloc'd blocks to disk prior to clearing the inode
> from the kernel.
>
> Test case:
> mkuddfs /dev/scd0
> mount -o sync /dev/scd0 /mnt/cdrom
> mkdir /mnt/cdrom/A /mnt/cdrom/B
> cp A/* /mnt/cdrom/A [ A contains 90 files of various sizes ]
> cp B/* /mnt/cdrom/B [ B contains 20 or fewer files ]
> umount /mnt/cdrom
>
> Here you can see how 7 blocks starting at sector 139 are free and listed in the
> directory entry for 'A'. I used udfdump to get the following information:
> [ ... ]
> Free space found on this partition
> [00000139 - 00000159] [00000161 - 00000191] [00000193 - 00000223]
> [00004464 - 00004475] [00004485 - 00524286] [00524287 - 01048573]
> [01048574 - 01572860] [01572861 - 02097147] [02097148 - 02235039]
> [ ... ]
> Filename `A`
> [ ... ]
> [ blob at sector 2038 for 2048 bytes in logical partion 0 ]
> [ blob at sector 137 for 4096 bytes in logical partion 0 ]
> [ blob at sector 139 for 14336 bytes in logical partion 0 flags 1 ]
>
> --
> Rich Coe [email protected]
> Virtual Principle Engineer General Electric Healthcare Technologies
> Global Software Platforms, Computer Technology Team

Signed-off-by: Rich Coe <[email protected]>
---
diff -urNp linux-2.6.20.orig/fs/udf/inode.c linux-2.6.20/fs/udf/inode.c
--- linux-2.6.20.orig/fs/udf/inode.c 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/inode.c 2007-06-13 11:32:41.930983471 -0500
@@ -102,14 +102,17 @@ no_delete:

void udf_clear_inode(struct inode *inode)
{
+ kfree(UDF_I_DATA(inode));
+ UDF_I_DATA(inode) = NULL;
+}
+
+void udf_put_inode(struct inode *inode)
+{
if (!(inode->i_sb->s_flags & MS_RDONLY)) {
lock_kernel();
udf_discard_prealloc(inode);
unlock_kernel();
}
-
- kfree(UDF_I_DATA(inode));
- UDF_I_DATA(inode) = NULL;
}

static int udf_writepage(struct page *page, struct writeback_control *wbc)
diff -urNp linux-2.6.20.orig/fs/udf/super.c linux-2.6.20/fs/udf/super.c
--- linux-2.6.20.orig/fs/udf/super.c 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/super.c 2007-06-13 11:31:23.793017185 -0500
@@ -166,6 +166,7 @@ static struct super_operations udf_sb_op
.write_inode = udf_write_inode,
.delete_inode = udf_delete_inode,
.clear_inode = udf_clear_inode,
+ .drop_inode = udf_put_inode,
.put_super = udf_put_super,
.write_super = udf_write_super,
.statfs = udf_statfs,
diff -urNp linux-2.6.20.orig/fs/udf/udfdecl.h linux-2.6.20/fs/udf/udfdecl.h
--- linux-2.6.20.orig/fs/udf/udfdecl.h 2007-02-04 12:44:54.000000000 -0600
+++ linux-2.6.20/fs/udf/udfdecl.h 2007-06-13 11:31:16.684293910 -0500
@@ -97,6 +97,7 @@ extern void udf_truncate(struct inode *)
extern void udf_read_inode(struct inode *);
extern void udf_delete_inode(struct inode *);
extern void udf_clear_inode(struct inode *);
+extern void udf_put_inode(struct inode *);
extern int udf_write_inode(struct inode *, int);
extern long udf_block_map(struct inode *, long);
extern int8_t inode_bmap(struct inode *, int, kernel_lb_addr *, uint32_t *, kernel_lb_addr *, uint32_t *, uint32_t *, struct buffer_head **);

2007-06-14 22:16:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: RFC: Re: PATCH: udf fs corruption on linux-2.6



On Thu, 14 Jun 2007, Rich Coe wrote:
>
> I've updated the patch below to use drop_inode rather than put_inode.
>
> drop_inode is only called when the last iput() reference to the inode is
> released, where put_inode is called for every iput().

Patch looks fine, but this late in the -rc series, I'd really like to get
an ACK from Jan or somebody else, just to make sure there are no other
issues with it.

Jan?

Linus

2007-06-15 15:56:44

by Jan Kara

[permalink] [raw]
Subject: Re: RFC: Re: PATCH: udf fs corruption on linux-2.6

On Thu 14-06-07 15:12:58, Linus Torvalds wrote:
>
>
> On Thu, 14 Jun 2007, Rich Coe wrote:
> >
> > I've updated the patch below to use drop_inode rather than put_inode.
> >
> > drop_inode is only called when the last iput() reference to the inode is
> > released, where put_inode is called for every iput().
>
> Patch looks fine, but this late in the -rc series, I'd really like to get
> an ACK from Jan or somebody else, just to make sure there are no other
> issues with it.
>
> Jan?
My fix for this problem is already sitting in Andrew's patch queue
(http://lkml.org/lkml/2007/6/11/79). Rich's patch still has a problem - you
cannot call udf_discard_prealloc() from drop_inode() because it is called
under inode_lock and thus you cannot call e.g. mark_inode_dirty(). I've done
that mistake too ;). So please don't apply the patch.

Honza
--
Jan Kara <[email protected]>
SuSE CR Labs

2007-06-15 23:24:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: RFC: Re: PATCH: udf fs corruption on linux-2.6



On Fri, 15 Jun 2007, Jan Kara wrote:
>
> My fix for this problem is already sitting in Andrew's patch queue
> (http://lkml.org/lkml/2007/6/11/79). Rich's patch still has a problem - you
> cannot call udf_discard_prealloc() from drop_inode() because it is called
> under inode_lock and thus you cannot call e.g. mark_inode_dirty(). I've done
> that mistake too ;). So please don't apply the patch.

Ok, dropped, will depend on Andrew to forward it to me. It may miss the
(already late) -rc5 I was hoping to release today, but I'll cc Andrew on
this, and maybe I'll get disorganized and just end up delaying -rc5 even
more ;)

Linus