Hi,
I was working on a reiserfs panic with 2.6.17-rc3, while running fs
stress tests.
The panic message looked like:
" REISERFS: panic (device Null superblock): reiserfs[4248]: assertion
!(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask)
) failed at fs/reiserfs/super.c:328:add_save_link: saved link already re
exists for truncated inode 13b5a "
------ Summary of the problem -----------
Reiserfs uses "save links" (directory entries with a special key
value) to keep track of truncated or unlinked files, so that integrity
is preserved across crashes.
Whenever a file is truncated or unlinked, reiserfs creates a save link
for it and deletes the link once the operation is complete.
If the machine crashes before the operation is committed, then on the
next mount the fs looks for the save links (easy to find, since they
have a special key) and completes the operation that was left
unfinished.
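To make the lifecycle above concrete, here is a minimal user-space sketch in C. It is a toy model only: struct toy_inode and the toy_* functions are made-up names, not kernel code, and the single boolean stands in for the on-disk save link. The assumption that recovery simply frees everything past i_size is a simplification for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy in-memory model; none of these names exist in the kernel. */
struct toy_inode {
    long i_size;     /* size recorded in the inode */
    long allocated;  /* bytes actually backed by blocks */
    bool save_link;  /* stands in for the on-disk save link */
};

/* Start a truncate: record the save link first, then the new size. */
void toy_truncate_begin(struct toy_inode *inode, long new_size)
{
    inode->save_link = true;
    inode->i_size = new_size;
}

/* Finish a truncate: free everything past i_size, then drop the link. */
void toy_truncate_finish(struct toy_inode *inode)
{
    if (inode->allocated > inode->i_size)
        inode->allocated = inode->i_size;
    inode->save_link = false;
}

/* Mount-time pass: complete every operation that still has a save link.
 * Returns the number of inodes that needed recovery. */
size_t toy_mount_recover(struct toy_inode *inodes, size_t n)
{
    size_t recovered = 0;
    for (size_t i = 0; i < n; i++) {
        if (inodes[i].save_link) {
            toy_truncate_finish(&inodes[i]);
            recovered++;
        }
    }
    return recovered;
}
```

A "crash" in this model is just a call to toy_truncate_begin() that is never followed by toy_truncate_finish(); the next toy_mount_recover() pass then finishes the job.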
The problem here occurs as follows:
Whenever there is an extending DIO write, the fs creates a save link
so that the file size stays consistent if there is a crash in the
middle of the DIO. The link is deleted once the write finishes.
If the DIO write happens to go through a hole in the file, it falls
back to a normal buffered write, which is done through the address
space operations prepare_write() and commit_write(). Now,
prepare_write() might allocate blocks for the file (if needed). If a
later step in prepare_write() fails (say with ENOSPC), the allocated
blocks have to be discarded. This is done by calling vmtruncate() on
the file. That call leads to the reiserfs-specific truncate, which
tries to add a save link for the file.
This second addition causes a reiserfs_panic, since there is already a
save link stored for the file.
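The sequence above can be sketched as a small user-space model in C. Everything here is hypothetical (the toy_* names, the status codes, the boolean flag standing in for i_link_saved_truncate_mask); only the order of calls follows the description, and the panic is modelled as a return value so the path can be exercised.

```c
#include <stdbool.h>

/* Toy model; names are hypothetical, only the control flow follows the text. */
enum toy_status { TOY_OK = 0, TOY_ENOSPC = -1, TOY_PANIC = -2 };

struct toy_inode {
    bool truncate_link;  /* plays the role of i_link_saved_truncate_mask */
};

/* Models the assertion in add_save_link(): a second truncate link is fatal. */
enum toy_status toy_add_save_link(struct toy_inode *inode)
{
    if (inode->truncate_link)
        return TOY_PANIC;
    inode->truncate_link = true;
    return TOY_OK;
}

void toy_remove_save_link(struct toy_inode *inode)
{
    inode->truncate_link = false;
}

/* Truncate takes its own save link for the duration of the operation. */
enum toy_status toy_truncate(struct toy_inode *inode)
{
    enum toy_status st = toy_add_save_link(inode);
    if (st != TOY_OK)
        return st;
    toy_remove_save_link(inode);
    return TOY_OK;
}

/* Buffered fallback: on ENOSPC the allocated blocks are discarded via truncate. */
enum toy_status toy_buffered_write(struct toy_inode *inode, bool out_of_space)
{
    if (out_of_space) {
        enum toy_status st = toy_truncate(inode);
        return st == TOY_OK ? TOY_ENOSPC : st;
    }
    return TOY_OK;
}

/* Extending DIO write: the save link is held across the whole write. */
enum toy_status toy_extending_dio_write(struct toy_inode *inode,
                                        bool crosses_hole, bool out_of_space)
{
    enum toy_status st = toy_add_save_link(inode);
    if (st != TOY_OK)
        return st;
    if (crosses_hole)
        st = toy_buffered_write(inode, out_of_space);
    if (st != TOY_PANIC)
        toy_remove_save_link(inode);
    return st;
}
```

In this model, toy_extending_dio_write(&inode, true, true) returns TOY_PANIC, while the same ENOSPC failure on a plain toy_buffered_write() just returns TOY_ENOSPC.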
Any thoughts on how to fix this?
thanks,
Suzuki K P
Linux Technology Centre,
IBM Software Labs.
On Fri, 05 May 2006 10:22:21 +0530 Suzuki wrote:
> Hi,
>
>
> I was working on a reiserfs panic with 2.6.17-rc3, while running fs
> stress tests.
Hi,
What test(s) do you use?
> The panic message looked like :
>
> " REISERFS: panic (device Null superblock): reiserfs[4248]: assertion
> !(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask)
> ) failed at fs/reiserfs/super.c:328:add_save_link: saved link already re
> exists for truncated inode 13b5a "
Thanks,
---
~Randy
Randy.Dunlap wrote:
> On Fri, 05 May 2006 10:22:21 +0530 Suzuki wrote:
>
>
>>Hi,
>>
>>
>>I was working on a reiserfs panic with 2.6.17-rc3, while running fs
>>stress tests.
>
>
> Hi,
> What test(s) do you use?
The problem was initially hit while running the following tests
simultaneously:
IOZone, bonnie++, dbench, fs_inod, fs_maim, fsstress, fsx_linux,
postmark, tiobench.
As I had mentioned in my post, I have a simple testcase that triggers
the panic by hitting the code path explained below.
The root cause of the problem is (as mentioned in the earlier post):
Whenever there is an extending DIO write, the fs creates a save link
so that the file size stays consistent if there is a crash in the
middle of the DIO. The link is deleted once the write finishes.
If the DIO write happens to go through a hole in the file, it falls
back to a normal buffered write, which is done through the address
space operations prepare_write() and commit_write(). Now,
prepare_write() might allocate blocks for the file (if needed). If a
later step in prepare_write() fails (say with ENOSPC), the allocated
blocks have to be discarded. This is done by calling vmtruncate() on
the file. That call leads to the reiserfs-specific truncate, which
tries to add a save link for the file.
This second addition causes a reiserfs_panic, since there is already a
save link stored for the file.
Thanks
Suzuki
>
>
>>The panic message looked like :
>>
>>" REISERFS: panic (device Null superblock): reiserfs[4248]: assertion
>>!(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask)
>>) failed at fs/reiserfs/super.c:328:add_save_link: saved link already re
>>exists for truncated inode 13b5a "
>
>
> Thanks,
> ---
> ~Randy
Hello
On Fri, 2006-05-12 at 12:15 +0530, Suzuki wrote:
> Randy.Dunlap wrote:
> > On Fri, 05 May 2006 10:22:21 +0530 Suzuki wrote:
> >
> >
> >>Hi,
> >>
> >>
> >>I was working on a reiserfs panic with 2.6.17-rc3, while running fs
> >>stress tests.
> >
> >
> > Hi,
> > What test(s) do you use?
>
> The problem was initially hit while running the following tests
> simultaneously..
> IOZone, bonnie++, dbench, fs_inod, fs_maim, fsstress, fsx_linux,
> postmark, tiobench.
>
> As I had mentioned in my post, I have a simple testcase to trigger the
> panic which can hit the code path explained below.
>
> The root cause of the problem is (as mentioned in the earlier post):
>
> Whenever there is an extending DIO write operation, the fs would
> create a safe link so as to ensure the file size consistent, if there is
> crash in between the DIO. This will be deleted once the write operation
> finishes.
>
I am not sure why a save link is needed for a write. Maybe whoever
added it still remembers why that was done and can explain, please?
Suzuki, would you please try the attached patch?
Vladimir V. Saveliev wrote:
> Hello
>
> On Fri, 2006-05-12 at 12:15 +0530, Suzuki wrote:
>
>>Randy.Dunlap wrote:
>>
>>>On Fri, 05 May 2006 10:22:21 +0530 Suzuki wrote:
>>>
>>>
>>>
>>>>Hi,
>>>>
>>>>
>>>>I was working on a reiserfs panic with 2.6.17-rc3, while running fs
>>>>stress tests.
>>>
>>>
>>>Hi,
>>>What test(s) do you use?
>>
>>The problem was initially hit while running the following tests
>>simultaneously..
>>IOZone, bonnie++, dbench, fs_inod, fs_maim, fsstress, fsx_linux,
>>postmark, tiobench.
>>
>>As I had mentioned in my post, I have a simple testcase to trigger the
>>panic which can hit the code path explained below.
>>
>>The root cause of the problem is (as mentioned in the earlier post):
>>
>> Whenever there is an extending DIO write operation, the fs would
>>create a safe link so as to ensure the file size consistent, if there is
>>crash in between the DIO. This will be deleted once the write operation
>>finishes.
>>
>
>
> I am not sure why safe link is needed for write. Maybe one who added
> that still remembers why that was done and can explain, please?
>
> Suzuki, would you please try the attached patch?
We had no luck reproducing the issue until recently, when it was hit
again with 2.6.18-rc7. The patch (with the changes needed to match the
current kernel) has been verified to resolve the issue. Attached is the
patch adjusted to the current level.
Thanks,
Suzuki
>
>
>
>
> ------------------------------------------------------------------------
>
>
> diff -puN fs/reiserfs/file.c~reiserfs-dont-use-safelink-on-write fs/reiserfs/file.c
> --- linux-2.6.17-rc3/fs/reiserfs/file.c~reiserfs-dont-use-safelink-on-write 2006-05-12 11:28:16.000000000 +0400
> +++ linux-2.6.17-rc3-vs/fs/reiserfs/file.c 2006-05-12 11:29:16.000000000 +0400
> @@ -1305,54 +1305,7 @@ static ssize_t reiserfs_file_write(struc
> }
>
> if (file->f_flags & O_DIRECT) { // Direct IO needs treatment
> - ssize_t result, after_file_end = 0;
> - if ((*ppos + count >= inode->i_size)
> - || (file->f_flags & O_APPEND)) {
> - /* If we are appending a file, we need to put this savelink in here.
> - If we will crash while doing direct io, finish_unfinished will
> - cut the garbage from the file end. */
> - reiserfs_write_lock(inode->i_sb);
> - err =
> - journal_begin(&th, inode->i_sb,
> - JOURNAL_PER_BALANCE_CNT);
> - if (err) {
> - reiserfs_write_unlock(inode->i_sb);
> - return err;
> - }
> - reiserfs_update_inode_transaction(inode);
> - add_save_link(&th, inode, 1 /* Truncate */ );
> - after_file_end = 1;
> - err =
> - journal_end(&th, inode->i_sb,
> - JOURNAL_PER_BALANCE_CNT);
> - reiserfs_write_unlock(inode->i_sb);
> - if (err)
> - return err;
> - }
> - result = generic_file_write(file, buf, count, ppos);
> -
> - if (after_file_end) { /* Now update i_size and remove the savelink */
> - struct reiserfs_transaction_handle th;
> - reiserfs_write_lock(inode->i_sb);
> - err = journal_begin(&th, inode->i_sb, 1);
> - if (err) {
> - reiserfs_write_unlock(inode->i_sb);
> - return err;
> - }
> - reiserfs_update_inode_transaction(inode);
> - mark_inode_dirty(inode);
> - err = journal_end(&th, inode->i_sb, 1);
> - if (err) {
> - reiserfs_write_unlock(inode->i_sb);
> - return err;
> - }
> - err = remove_save_link(inode, 1 /* truncate */ );
> - reiserfs_write_unlock(inode->i_sb);
> - if (err)
> - return err;
> - }
> -
> - return result;
> + return generic_file_write(file, buf, count, ppos);
> }
>
> if (unlikely((ssize_t) count < 0))
>
> _
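
For readers skimming the diff, the effect of the change on the failing path can be modelled in a few lines of C. This is only a sketch under stated assumptions: toy_failed_dio_write() and its flag are hypothetical, and the whole fallback (buffered write, ENOSPC, truncate) is collapsed into one check of whether a truncate save link is already held.

```c
#include <stdbool.h>

/* Possible outcomes of the modelled write; both names are hypothetical. */
enum toy_result { TOY_ENOSPC_RETURNED, TOY_PANICKED };

/* Truncate asserts that no truncate save link is already outstanding. */
static bool toy_truncate_would_panic(bool link_held)
{
    return link_held;
}

/*
 * Models an extending O_DIRECT write that falls back to a buffered
 * write, hits ENOSPC and calls truncate to discard the new blocks.
 * savelink_on_write == true  : old code, add_save_link() before the write
 * savelink_on_write == false : patched code, generic_file_write() only
 */
enum toy_result toy_failed_dio_write(bool savelink_on_write)
{
    bool link_held = savelink_on_write;

    if (toy_truncate_would_panic(link_held))
        return TOY_PANICKED;
    return TOY_ENOSPC_RETURNED;
}
```

With the flag set to false the error is simply returned to the caller. The comment removed by the patch says the save link existed so that finish_unfinished could cut garbage from the file end after a crash during direct I/O, so the patched path presumably gives up that cleanup; the model does not cover it.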