2009-11-06 22:29:06

by Eric Sandeen

Subject: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

ext4_xattr_set_handle() was modifying s_inode_size outside
of journaling constraints; this is one of the accesses that
was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.

Signed-off-by: Eric Sandeen <[email protected]>
---

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index fed5b01..0257019 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
 	if (error)
 		goto cleanup;

+	error = ext4_journal_get_write_access(handle, is.iloc.bh);
+	if (error)
+		goto cleanup;
+
 	if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
 		struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
 		memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
@@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
 			if (flags & XATTR_CREATE)
 				goto cleanup;
 		}
-	error = ext4_journal_get_write_access(handle, is.iloc.bh);
-	if (error)
-		goto cleanup;
 	if (!value) {
 		if (!is.s.not_found)
 			error = ext4_xattr_ibody_set(handle, inode, &i, &is);



2009-11-06 22:35:09

by Eric Sandeen

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

Eric Sandeen wrote:
> ext4_xattr_set_handle() was modifying s_inode_size outside
> of journaling constraints; this is one of the accesses that
> was causing the crc errors in journal replay as seen in
> kernel.org bugzilla #14354.

Oh, and for those who haven't been following the bug, big
thanks to Chris Mason for helping to look into this and coming
up with the debugging patch that made it obvious...

-Eric

> Signed-off-by: Eric Sandeen <[email protected]>
> ---
>
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index fed5b01..0257019 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct
> inode *inode, int name_index,
> if (error)
> goto cleanup;
>
> + error = ext4_journal_get_write_access(handle, is.iloc.bh);
> + if (error)
> + goto cleanup;
> +
> if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
> struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
> memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
> @@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle, struct
> inode *inode, int name_index,
> if (flags & XATTR_CREATE)
> goto cleanup;
> }
> - error = ext4_journal_get_write_access(handle, is.iloc.bh);
> - if (error)
> - goto cleanup;
> if (!value) {
> if (!is.s.not_found)
> error = ext4_xattr_ibody_set(handle, inode, &i, &is);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2009-11-07 00:18:42

by Andreas Dilger

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

On 2009-11-06, at 15:29, Eric Sandeen wrote:
> ext4_xattr_set_handle() was modifying s_inode_size outside
> of journaling constraints; this is one of the accesses that
> was causing the crc errors in journal replay as seen in
> kernel.org bugzilla #14354.

Is this description accurate? It doesn't seem to be modifying
s_inode_size, per se, but rather zeroing the whole inode if it is a
new inode that was never read from disk.

Other than the above description the patch looks correct.

Reviewed-by: Andreas Dilger <[email protected]>

> @@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct
> inode *inode, int name_index,
> if (error)
> goto cleanup;
> + error = ext4_journal_get_write_access(handle, is.iloc.bh);
> + if (error)
> + goto cleanup;
> +
> if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
> struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
> memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
> @@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle, struct
> inode *inode, int name_index,
> if (flags & XATTR_CREATE)
> goto cleanup;
> }
> - error = ext4_journal_get_write_access(handle, is.iloc.bh);
> - if (error)
> - goto cleanup;
> if (!value) {
> if (!is.s.not_found)
> error = ext4_xattr_ibody_set(handle, inode, &i, &is);
>


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-11-07 00:22:48

by Andreas Dilger

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

On 2009-11-06, at 15:35, Eric Sandeen wrote:
> Eric Sandeen wrote:
>> ext4_xattr_set_handle() was modifying s_inode_size outside
>> of journaling constraints; this is one of the accesses that
>> was causing the crc errors in journal replay as seen in
>> kernel.org bugzilla #14354.
>
> Oh, and for those who haven't been following the bug, big
> thanks to Chris Mason for helping to look into this and coming
> up with the debugging patch that made it obvious...

It would be great, IMHO, to have this debugging patch submitted to
the kernel also, and enabled under a CONFIG option.

Having a description of the side effects (i.e. page fault when a
read-only page is accessed) in the Kconfig description would be
needed, but at least if it is in the code it can be used by anyone
trying to track down the problem, rather than the perennial "where
are AKPM's buffer-head tracing patches, and how much work needs to
be done to update them for the current kernel".

I'd also be interested to see the "write shadow buffer to journal"
one-line patch that was discussed in the bug.

>> Signed-off-by: Eric Sandeen <[email protected]>
>> ---
>> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
>> index fed5b01..0257019 100644
>> --- a/fs/ext4/xattr.c
>> +++ b/fs/ext4/xattr.c
>> @@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct
>> inode *inode, int name_index,
>> if (error)
>> goto cleanup;
>> + error = ext4_journal_get_write_access(handle, is.iloc.bh);
>> + if (error)
>> + goto cleanup;
>> +
>> if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
>> struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
>> memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
>> @@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle,
>> struct inode *inode, int name_index,
>> if (flags & XATTR_CREATE)
>> goto cleanup;
>> }
>> - error = ext4_journal_get_write_access(handle, is.iloc.bh);
>> - if (error)
>> - goto cleanup;
>> if (!value) {
>> if (!is.s.not_found)
>> error = ext4_xattr_ibody_set(handle, inode, &i, &is);


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-11-07 01:00:48

by Eric Sandeen

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

Andreas Dilger wrote:
> On 2009-11-06, at 15:29, Eric Sandeen wrote:
>> ext4_xattr_set_handle() was modifying s_inode_size outside
>> of journaling constraints; this is one of the accesses that
>> was causing the crc errors in journal replay as seen in
>> kernel.org bugzilla #14354.
>
> Is this description accurate? It doesn't seem to be modifying
> s_inode_size, per se, but rather zeroing the whole inode if it is a new
> inode that was never read from disk.

Doh, skimmed too fast, you're right. I'll resend a V2 w/ the proper
description, sorry about that. :)

-eric

> Other than the above description the patch looks correct.
>
> Reviewed-by: Andreas Dilger <[email protected]>
>
>> @@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct
>> inode *inode, int name_index,
>> if (error)
>> goto cleanup;
>> + error = ext4_journal_get_write_access(handle, is.iloc.bh);
>> + if (error)
>> + goto cleanup;
>> +
>> if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
>> struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
>> memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
>> @@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle, struct
>> inode *inode, int name_index,
>> if (flags & XATTR_CREATE)
>> goto cleanup;
>> }
>> - error = ext4_journal_get_write_access(handle, is.iloc.bh);
>> - if (error)
>> - goto cleanup;
>> if (!value) {
>> if (!is.s.not_found)
>> error = ext4_xattr_ibody_set(handle, inode, &i, &is);
>>
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>


2009-11-07 01:03:15

by Eric Sandeen

Subject: [PATCH 1/2 V2] ext4: journal all modifications in ext4_xattr_set_handle

ext4_xattr_set_handle() was zeroing out an inode outside
of journaling constraints; this is one of the accesses that
was causing the crc errors in journal replay as seen in
kernel.org bugzilla #14354.


Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
---

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index fed5b01..0257019 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -988,6 +988,10 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
 	if (error)
 		goto cleanup;

+	error = ext4_journal_get_write_access(handle, is.iloc.bh);
+	if (error)
+		goto cleanup;
+
 	if (EXT4_I(inode)->i_state & EXT4_STATE_NEW) {
 		struct ext4_inode *raw_inode = ext4_raw_inode(&is.iloc);
 		memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
@@ -1013,9 +1017,6 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
 			if (flags & XATTR_CREATE)
 				goto cleanup;
 		}
-	error = ext4_journal_get_write_access(handle, is.iloc.bh);
-	if (error)
-		goto cleanup;
 	if (!value) {
 		if (!is.s.not_found)
 			error = ext4_xattr_ibody_set(handle, inode, &i, &is);
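
The ordering problem this patch addresses can be illustrated outside the kernel. The following is only a toy sketch, not the jbd2 or ext4 API: `struct toy_buffer`, `toy_get_write_access()`, `toy_zero()`, `run_old_order()` and `run_new_order()` are all hypothetical names invented for the illustration. It counts how many modifications touch the buffer before write access has been requested, with the old ordering (zero the new inode first) and the patched ordering (request write access first).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_INODE_SIZE 64

/* Hypothetical stand-in for a journaled inode buffer; not a kernel type. */
struct toy_buffer {
	unsigned char data[TOY_INODE_SIZE];
	bool write_access;               /* set once the toy journal knows about us */
	unsigned int unjournaled_writes; /* modifications made before that point */
};

/* Toy analogue of asking the journal for write access; always succeeds. */
static int toy_get_write_access(struct toy_buffer *b)
{
	b->write_access = true;
	return 0;
}

/* All modifications go through here so that the ordering can be audited. */
static void toy_zero(struct toy_buffer *b, size_t len)
{
	assert(len <= sizeof(b->data));
	if (!b->write_access)
		b->unjournaled_writes++;
	memset(b->data, 0, len);
}

/* Old ordering: zero the new inode, then request write access. */
unsigned int run_old_order(void)
{
	struct toy_buffer b = { .write_access = false };

	toy_zero(&b, sizeof(b.data));   /* models the memset() of a new inode */
	if (toy_get_write_access(&b))
		return b.unjournaled_writes;
	toy_zero(&b, 8);                /* models the later in-inode update */
	return b.unjournaled_writes;
}

/* Patched ordering: request write access before touching the buffer. */
unsigned int run_new_order(void)
{
	struct toy_buffer b = { .write_access = false };

	if (toy_get_write_access(&b))
		return b.unjournaled_writes;
	toy_zero(&b, sizeof(b.data));
	toy_zero(&b, 8);
	return b.unjournaled_writes;
}
```

In this model `run_old_order()` reports one modification made outside the journal's knowledge and `run_new_order()` reports none, which is the property the moved `ext4_journal_get_write_access()` call is meant to restore.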


2009-11-07 02:33:39

by Theodore Ts'o

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

On Fri, Nov 06, 2009 at 05:22:51PM -0700, Andreas Dilger wrote:
> On 2009-11-06, at 15:35, Eric Sandeen wrote:
>> Eric Sandeen wrote:
>>> ext4_xattr_set_handle() was modifying s_inode_size outside
>>> of journaling constraints; this is one of the accesses that
>>> was causing the crc errors in journal replay as seen in
>>> kernel.org bugzilla #14354.
>>
>> Oh, and for those who haven't been following the bug, big
>> thanks to Chris Mason for helping to look into this and coming
>> up with the debugging patch that made it obvious...
>
> It would be great, IMHO, to have this debugging patch submitted to
> the kernel also, and enabled under a CONFIG option.

The debugging patch is only safe when the file system's block size ==
page size. I suppose we could leave a warning in Kconfig that the
debugging option is only safe to be used when all of the file systems
in the system have a 4k (or whatever the page size happens to be)
block size, but it's probably better to clean up the patch so that
it's only enabled on a mount option, and then only if the file system
block size is the same as the page size.

> I'd also be interested to see the "write shadow buffer to journal"
> one-line patch that was discussed in the bug.

Which patch are you referring to?

- Ted
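
The mount-time condition Ted describes (enable the debugging behaviour only when it was requested and the file system block size equals the page size) could be sketched roughly as below. This is an assumption-laden illustration, not code from the debugging patch: `may_enable_write_debug()`, `blocks_per_page()` and the `debug_requested` flag are hypothetical names. The likely reason for the restriction is that read-only protection applies to a whole page, so with smaller blocks several buffers would share one protected page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: how many file system blocks share one page. */
size_t blocks_per_page(size_t page_size, size_t block_size)
{
	assert(block_size > 0 && block_size <= page_size);
	return page_size / block_size;
}

/*
 * Hypothetical mount-time check. debug_requested models the proposed
 * mount option; the debugging mode is allowed only when each page holds
 * exactly one block, i.e. block size == page size.
 */
bool may_enable_write_debug(bool debug_requested,
			    size_t page_size, size_t block_size)
{
	if (!debug_requested)
		return false;
	return block_size == page_size;
}
```

With a 4096-byte page, a 4096-byte block size passes the check, while a 1024-byte block size (four blocks per page) does not.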

2009-11-07 19:22:10

by Andreas Dilger

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

On 2009-11-06, at 19:33, Theodore Tso wrote:
> On Fri, Nov 06, 2009 at 05:22:51PM -0700, Andreas Dilger wrote:
>> On 2009-11-06, at 15:35, Eric Sandeen wrote:
>>> Oh, and for those who haven't been following the bug, big
>>> thanks to Chris Mason for helping to look into this and coming
>>> up with the debugging patch that made it obvious...
>>
>> It would be great, IMHO, to have this debugging patch submitted to
>> the kernel also, and enabled under a CONFIG option.
>
> The debugging patch is only safe when the file system's block size ==
> page size. I suppose we could leave a warning in Kconfig that the
> debugging option is only safe to be used when all of the file systems
> in the system have a 4k (or whatever the page size happens to be)
> block size, but it's probably better to clean up the patch so that
> it's only enabled on a mount option, and then only if the file system
> block size is the same as the page size.

Yes, definitely, but this covers about 90% of the filesystems in use,
so I don't see it as a big limitation.

>> I'd also be interested to see the "write shadow buffer to journal"
>> one-line patch that was discussed in the bug.
>
> Which patch are you referring to?


In bugzilla there was a comment from Chris:

"Eric helped prove this is the case by testing a patch that always
crcs and writes a stable duplicate of the metadata buffer instead
of the metadata buffer itself (a one liner since the log code was
already setup for this)."

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-11-07 19:31:02

by Eric Sandeen

Subject: Re: [PATCH 1/2] ext4: journal all modifications in ext4_xattr_set_handle

Andreas Dilger wrote:
> On 2009-11-06, at 19:33, Theodore Tso wrote:

...

>>> I'd also be interested to see the "write shadow buffer to journal"
>>> one-line patch that was discussed in the bug.
>>
>> Which patch are you referring to?
>
>
> In bugzilla there was a comment from Chris:
>
> "Eric helped prove this is the case by testing a patch that always
> crcs and writes a stable duplicate of the metadata buffer instead
> of the metadata buffer itself (a one liner since the log code was
> already setup for this)."
>

Like this:

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b0ab521..75dd58f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -353,7 +353,7 @@ repeat:
 	/*
 	 * Do we need to do a data copy?
 	 */
-	if (need_copy_out && !done_copy_out) {
+	if (!done_copy_out) {
 		char *tmp;

 		jbd_unlock_bh_state(bh_in);
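
Why forcing the copy-out hides the checksum mismatch can be shown with a small userspace model. This is a sketch under stated assumptions, not jbd2 code: `toy_sum()` is a simple stand-in checksum rather than the CRC the journal uses, and `toy_commit()` with its `copy_out` flag is a hypothetical name. The model records a checksum, lets a stray writer modify the live buffer, and then writes the block out, either from the live buffer or from a frozen duplicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK 16

/* Toy checksum; a stand-in only, not the CRC used by the journal. */
static uint32_t toy_sum(const unsigned char *p, size_t n)
{
	uint32_t s = 0;

	assert(p != NULL);
	for (size_t i = 0; i < n; i++)
		s = s * 31u + p[i];
	return s;
}

/*
 * Model of one journal block write. The checksum is recorded first, then
 * an unjournaled writer scribbles on the live buffer, then the block is
 * "written". With copy_out set, both the checksum and the write use a
 * private frozen duplicate, so the scribble cannot affect them.
 * Returns true if the recorded checksum matches the written block.
 */
bool toy_commit(bool copy_out)
{
	unsigned char live[TOY_BLOCK] = { 1, 2, 3 };
	unsigned char frozen[TOY_BLOCK];
	unsigned char on_disk[TOY_BLOCK];
	const unsigned char *src = live;
	uint32_t recorded;

	if (copy_out) {
		memcpy(frozen, live, TOY_BLOCK);
		src = frozen;
	}

	recorded = toy_sum(src, TOY_BLOCK);
	live[0] ^= 0xff;                  /* modification racing with the commit */
	memcpy(on_disk, src, TOY_BLOCK);  /* the block that reaches the journal */

	return toy_sum(on_disk, TOY_BLOCK) == recorded;
}
```

In this model `toy_commit(false)` produces a mismatch and `toy_commit(true)` does not. That matches the role of the one-liner as a diagnostic: it masks the symptom by always writing a stable duplicate, while the unjournaled modification itself still has to be fixed at its source.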