[1] https://www.mail-archive.com/[email protected]/msg15126.html

As [1] reported, if the lower device doesn't support write barriers, data
can be corrupted in the following case:

- write page #0; persist
- overwrite page #0
- fsync
  - write data page #0 OPU into device's cache
  - write inode page into device's cache
  - issue flush

If SPO (sudden power-off) is triggered during the flush command, the inode
page can be persisted before data page #0. After recovery, the inode page
is then recovered with the new physical block address of data page #0, but
that address may still contain dummy data.

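
The reordering described above can be modeled with a small toy program. This
is only a sketch for illustration, not f2fs code: `toy_media`,
`toy_fsync_with_spo()` and the block contents are hypothetical names and
values, and the "device" is reduced to two blocks plus one inode pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model, not f2fs code: two media blocks plus one inode pointer. */
enum { OLD_BLK = 0, NEW_BLK = 1, BLK_SIZE = 8 };

struct toy_media {
	char blocks[2][BLK_SIZE];	/* persisted block contents */
	int inode_addr;			/* persisted block address in the inode */
};

/* State after "write page #0; persist": the old data is durable. */
static void toy_init(struct toy_media *m)
{
	memcpy(m->blocks[OLD_BLK], "olddata", BLK_SIZE);
	memcpy(m->blocks[NEW_BLK], "garbage", BLK_SIZE);
	m->inode_addr = OLD_BLK;
}

/*
 * Simulate fsync() of an OPU overwrite where power is lost after the device
 * has persisted only `persisted` of its two cached writes. Without a write
 * barrier the device may persist the inode first (inode_first == true).
 */
static void toy_fsync_with_spo(struct toy_media *m, bool inode_first,
			       int persisted)
{
	assert(persisted >= 0 && persisted <= 2);

	for (int step = 0; step < persisted; step++) {
		bool do_inode = inode_first ? (step == 0) : (step == 1);

		if (do_inode)
			m->inode_addr = NEW_BLK;	/* inode now points to new block */
		else
			memcpy(m->blocks[NEW_BLK], "newdata", BLK_SIZE);
	}
}

/* What a reader sees after recovery: the block the inode points to. */
static const char *toy_read(const struct toy_media *m)
{
	return m->blocks[m->inode_addr];
}
```

In this model, `toy_fsync_with_spo(&m, true, 1)` leaves the inode pointing at
a block that still holds "garbage", while `toy_fsync_with_spo(&m, false, 1)`
leaves the old data readable, which is the ordering the report expects.
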
What the user then sees is: after overwrite & fsync + SPO, the old data in
the file is corrupted. If a user cares about this case, we can suggest
using STRICT fsync mode. In this mode, we force atomic write semantics to
keep the write order between data/node and the last node, which avoids
potential data corruption during fsync().

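
For reference, the user-visible part of the sequence (write page #0, persist,
overwrite, fsync) might look like the sketch below from user space. The
function names and the 4096-byte page size are assumptions made for
illustration. Running it cannot reproduce the corruption on its own, since
that also needs a power cut during the flush on a device without write
barrier support; the STRICT mode discussed here is selected with the f2fs
mount option fsync_mode=strict.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

enum { PAGE_SZ = 4096 };	/* assumed page size for this sketch */

/* Write one full page of `fill` at offset 0, then fsync. Returns 0 on success. */
static int write_page0_and_fsync(int fd, char fill)
{
	char buf[PAGE_SZ];

	memset(buf, fill, sizeof(buf));
	if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		return -1;
	return fsync(fd);
}

/* Sequence from the report: write page #0, persist, overwrite, fsync. */
static int overwrite_sequence(const char *path)
{
	int fd, ret;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	ret = write_page0_and_fsync(fd, 'A');		/* write page #0; persist */
	if (ret == 0)
		ret = write_page0_and_fsync(fd, 'B');	/* overwrite page #0; fsync */

	if (close(fd) != 0 && ret == 0)
		ret = -1;
	return ret;
}
```

After a successful return, an application expects to read back either the old
page or the new page after any later power loss, never dummy data.
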
Signed-off-by: Chao Yu <[email protected]>
---
fs/f2fs/file.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6afd4562335f..00b45876eaa1 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -301,6 +301,18 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
 				f2fs_exist_written_data(sbi, ino, UPDATE_INO))
 			goto flush_out;
 		goto out;
+	} else {
+		/*
+		 * for OPU case, during fsync(), node can be persisted before
+		 * data when lower device doesn't support write barrier, which
+		 * results in data corruption after SPO.
+		 * So for strict fsync mode, force to use atomic write semantics
+		 * to keep write order in between data/node and last node to
+		 * avoid potential data corruption.
+		 */
+		if (F2FS_OPTION(sbi).fsync_mode ==
+				FSYNC_MODE_STRICT && !atomic)
+			atomic = true;
 	}
 go_write:
 	/*
--
2.22.1
On 2021/7/20 9:15, Jaegeuk Kim wrote:
> Wasn't it supposed to be v1?
I skipped the IPU case for v1 and resent it as v3. Is that fine with you?
Thanks,
Wasn't it supposed to be v1?
Ping,
On 07/29, Chao Yu wrote:
> Ping,
Added. Thanks.