2019-11-26 08:31:41

by Damien Le Moal

Subject: [PATCH] f2fs: Fix direct IO handling

f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
flag for a kiocb structure. However, the file system direct IO handler
function f2fs_direct_IO() may have decided that a direct IO has to be
executed as a buffered IO using the function f2fs_force_buffered_io().
This is the case, for instance, for volumes that include a zoned block
device and for unaligned write IOs with LFS mode enabled.

These 2 different methods of identifying direct IOs can result in
inconsistencies generating stale data access for direct reads after a
direct IO write that is treated as a buffered write. Fix this
inconsistency by combining the IOCB_DIRECT flag test with the result
of f2fs_force_buffered_io().

Reported-by: Javier Gonzalez <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
---
 fs/f2fs/data.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5755e897a5f0..8ac2d3b70022 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 	int flag;
 	int err = 0;
 	bool direct_io = iocb->ki_flags & IOCB_DIRECT;
+	bool do_direct_io = direct_io &&
+		!f2fs_force_buffered_io(inode, iocb, from);

 	/* convert inline data for Direct I/O*/
 	if (direct_io) {
@@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 			return err;
 	}

-	if (direct_io && allow_outplace_dio(inode, iocb, from))
+	if (do_direct_io && allow_outplace_dio(inode, iocb, from))
 		return 0;

 	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
--
2.23.0
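
The mismatch described in the commit message can be illustrated with a small
user-space model. This is only a sketch: struct model_io, MODEL_IOCB_DIRECT
and the model_* helpers are hypothetical stand-ins, not kernel code, and
model_force_buffered_io() covers only the two fallback cases the commit
message names (zoned volumes, and unaligned writes with LFS mode enabled).

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's IOCB_DIRECT bit; the value is arbitrary here. */
#define MODEL_IOCB_DIRECT 0x1u

/* Hypothetical, simplified description of one I/O request. */
struct model_io {
	unsigned int ki_flags;	/* request flags, as in struct kiocb */
	bool is_write;
	bool aligned;		/* the I/O is block aligned */
	bool zoned_volume;	/* the volume includes a zoned block device */
	bool lfs_mode;		/* LFS mode is enabled */
};

/* Models only the two fallback cases named in the commit message. */
static bool model_force_buffered_io(const struct model_io *io)
{
	assert(io);
	if (io->zoned_volume)
		return true;
	return io->lfs_mode && io->is_write && !io->aligned;
}

/* Check used before the patch: the flag alone. */
static bool model_is_dio_flag_only(const struct model_io *io)
{
	assert(io);
	return (io->ki_flags & MODEL_IOCB_DIRECT) != 0;
}

/* Check used by the patch: the flag combined with the fallback decision. */
static bool model_is_dio_combined(const struct model_io *io)
{
	return model_is_dio_flag_only(io) && !model_force_buffered_io(io);
}
```

For a request on a zoned volume with the direct flag set, the flag-only check
reports direct IO while the combined check reports buffered IO, which is the
disagreement the patch removes.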


2019-11-26 08:39:53

by Ritesh Harjani

Subject: Re: [PATCH] f2fs: Fix direct IO handling

Hello Damien,

IIUC, you are trying to fix a stale data read by a DIO read for the case
you explained in your patch, i.e. a DIO write that is forced to go
through buffered IO.

Coincidentally, I was looking at the same code path just now.
So I have a question for you and the f2fs group. It may be a silly one,
as I don't understand F2FS in great detail.

How is a DIO read protected against reading stale data when it races
with mmap writes via f2fs_vm_page_mkwrite()?

f2fs_vm_page_mkwrite()          f2fs_direct_IO (read)
                                filemap_write_and_wait_range()
-> f2fs_get_blocks()
                                -> submit_bio()

-> set_page_dirty()

Is the above race possible with the current f2fs code?
i.e. could f2fs_direct_IO() read stale data from the blocks
that were allocated due to the mmap fault?

Am I missing something here?

-ritesh

On 11/26/19 1:27 PM, Damien Le Moal wrote:
> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
> flag for a kiocb structure. However, the file system direct IO handler
> function f2fs_direct_IO() may have decided that a direct IO has to be
> exececuted as a buffered IO using the function f2fs_force_buffered_io().
> This is the case for instance for volumes including zoned block device
> and for unaligned write IOs with LFS mode enabled.
>
> These 2 different methods of identifying direct IOs can result in
> inconsistencies generating stale data access for direct reads after a
> direct IO write that is treated as a buffered write. Fix this
> inconsistency by combining the IOCB_DIRECT flag test with the result
> of f2fs_force_buffered_io().
>
> Reported-by: Javier Gonzalez <[email protected]>
> Signed-off-by: Damien Le Moal <[email protected]>
> ---
> fs/f2fs/data.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 5755e897a5f0..8ac2d3b70022 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int flag;
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
> + bool do_direct_io = direct_io &&
> + !f2fs_force_buffered_io(inode, iocb, from);
>
> /* convert inline data for Direct I/O*/
> if (direct_io) {
> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> return err;
> }
>
> - if (direct_io && allow_outplace_dio(inode, iocb, from))
> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
> return 0;
>
> if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>

2019-11-26 16:37:10

by Christoph Hellwig

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On Tue, Nov 26, 2019 at 04:57:19PM +0900, Damien Le Moal wrote:
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 5755e897a5f0..8ac2d3b70022 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int flag;
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
> + bool do_direct_io = direct_io &&
> + !f2fs_force_buffered_io(inode, iocb, from);

I don't think this is the right fix. The proper fix is to clear
IOCB_DIRECT when falling back to buffered I/O, preferably in the
filemap.c helpers as well.
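
The alternative suggested here can be sketched in user space as follows. This
is an illustration under assumptions, not the kernel implementation:
struct model_kiocb, MODEL_IOCB_DIRECT and both helpers are hypothetical names.
The idea is that the fallback path clears the flag once, so every later test
of the flag agrees with what was actually decided.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's IOCB_DIRECT bit; the value is arbitrary here. */
#define MODEL_IOCB_DIRECT 0x1u

/* Hypothetical, reduced stand-in for struct kiocb. */
struct model_kiocb {
	unsigned int ki_flags;
};

/* Called at the point where direct IO is abandoned for buffered IO. */
static void model_fall_back_to_buffered(struct model_kiocb *iocb)
{
	assert(iocb);
	/* Clear only the direct bit; all other flags are preserved. */
	iocb->ki_flags &= ~MODEL_IOCB_DIRECT;
}

/* Any later code can keep testing the flag and get a consistent answer. */
static bool model_is_direct(const struct model_kiocb *iocb)
{
	assert(iocb);
	return (iocb->ki_flags & MODEL_IOCB_DIRECT) != 0;
}
```

With this approach no second predicate is needed: the flag itself remains the
single source of truth for the rest of the write path.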

2019-11-26 23:46:37

by Jaegeuk Kim

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On 11/26, Damien Le Moal wrote:
> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
> flag for a kiocb structure. However, the file system direct IO handler
> function f2fs_direct_IO() may have decided that a direct IO has to be
> exececuted as a buffered IO using the function f2fs_force_buffered_io().
> This is the case for instance for volumes including zoned block device
> and for unaligned write IOs with LFS mode enabled.
>
> These 2 different methods of identifying direct IOs can result in
> inconsistencies generating stale data access for direct reads after a
> direct IO write that is treated as a buffered write. Fix this
> inconsistency by combining the IOCB_DIRECT flag test with the result
> of f2fs_force_buffered_io().
>
> Reported-by: Javier Gonzalez <[email protected]>
> Signed-off-by: Damien Le Moal <[email protected]>
> ---
> fs/f2fs/data.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 5755e897a5f0..8ac2d3b70022 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int flag;
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
> + bool do_direct_io = direct_io &&
> + !f2fs_force_buffered_io(inode, iocb, from);
>
> /* convert inline data for Direct I/O*/
> if (direct_io) {
> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> return err;
> }
>
> - if (direct_io && allow_outplace_dio(inode, iocb, from))
> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))

It seems f2fs_force_buffered_io() includes allow_outplace_dio().

How about this?
---
 fs/f2fs/data.c | 13 -------------
 fs/f2fs/file.c | 35 +++++++++++++++++++++++++----------
 2 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a034cd0ce021..fc40a72f7827 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1180,19 +1180,6 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 	int err = 0;
 	bool direct_io = iocb->ki_flags & IOCB_DIRECT;

-	/* convert inline data for Direct I/O*/
-	if (direct_io) {
-		err = f2fs_convert_inline_inode(inode);
-		if (err)
-			return err;
-	}
-
-	if (direct_io && allow_outplace_dio(inode, iocb, from))
-		return 0;
-
-	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
-		return 0;
-
 	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
 	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
 	if (map.m_len > map.m_lblk)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index c0560d62dbee..6b32ac6c3382 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3386,18 +3386,33 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 				ret = -EAGAIN;
 				goto out;
 			}
-		} else {
-			preallocated = true;
-			target_size = iocb->ki_pos + iov_iter_count(from);
+			goto write;
+		}

-			err = f2fs_preallocate_blocks(iocb, from);
-			if (err) {
-				clear_inode_flag(inode, FI_NO_PREALLOC);
-				inode_unlock(inode);
-				ret = err;
-				goto out;
-			}
+		if (is_inode_flag_set(inode, FI_NO_PREALLOC))
+			goto write;
+
+		if (iocb->ki_flags & IOCB_DIRECT) {
+			/* convert inline data for Direct I/O*/
+			err = f2fs_convert_inline_inode(inode);
+			if (err)
+				goto out_err;
+
+			if (!f2fs_force_buffered_io(inode, iocb, from))
+				goto write;
+		}
+		preallocated = true;
+		target_size = iocb->ki_pos + iov_iter_count(from);
+
+		err = f2fs_preallocate_blocks(iocb, from);
+		if (err) {
+out_err:
+			clear_inode_flag(inode, FI_NO_PREALLOC);
+			inode_unlock(inode);
+			ret = err;
+			goto out;
 		}
+write:
 		ret = __generic_file_write_iter(iocb, from);
 		clear_inode_flag(inode, FI_NO_PREALLOC);


--
2.19.0.605.g01d371f741-goog
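
The control flow this patch adds to f2fs_file_write_iter() can be read as a
single decision: preallocate blocks before writing, or go straight to the
write. Below is a user-space sketch of that decision only. It assumes the
request has already passed the IOCB_NOWAIT branch, and it leaves out the
inline-data conversion and all error handling. The enum, the struct and the
function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum model_write_path {
	MODEL_WRITE_NO_PREALLOC,	/* "goto write": skip preallocation */
	MODEL_PREALLOC_THEN_WRITE,	/* f2fs_preallocate_blocks() runs first */
};

/* Hypothetical summary of the inputs the patch tests. */
struct model_write_req {
	bool no_prealloc;	/* FI_NO_PREALLOC is set on the inode */
	bool direct;		/* IOCB_DIRECT is set in ki_flags */
	bool force_buffered;	/* result of f2fs_force_buffered_io() */
};

static enum model_write_path
model_pick_write_path(const struct model_write_req *req)
{
	assert(req);

	/* FI_NO_PREALLOC short-circuits everything else. */
	if (req->no_prealloc)
		return MODEL_WRITE_NO_PREALLOC;

	/* A direct IO that really stays direct skips preallocation. */
	if (req->direct && !req->force_buffered)
		return MODEL_WRITE_NO_PREALLOC;

	/* Buffered writes, and direct IOs forced to buffered, preallocate. */
	return MODEL_PREALLOC_THEN_WRITE;
}
```

Seen this way, a direct IO that is forced to buffered takes the same
preallocation path as an ordinary buffered write.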

2019-11-28 02:55:33

by Damien Le Moal

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On 2019/11/26 17:34, Ritesh Harjani wrote:
> Hello Damien,
>
> IIUC, you are trying to fix a stale data read by DIO read for the case
> you explained in your patch w.r.t. DIO-write forced to write as buffIO.
>
> Coincidentally I was just looking at the same code path just now.
> So I do have a query to you/f2fs group. Below could be silly one, as I
> don't understand F2FS in great detail.
>
> How is the stale data by DIO read, is protected against a mmap
> writes via f2fs_vm_page_mkwrite?
>
> f2fs_vm_page_mkwrite()          f2fs_direct_IO (read)
>                                 filemap_write_and_wait_range()
> -> f2fs_get_blocks()
>                                 -> submit_bio()
>
> -> set_page_dirty()
>
> Is above race possible with current f2fs code?
> i.e. f2fs_direct_IO could read the stale data from the blocks
> which were allocated due to mmap fault?

The faulted page is locked until the fault is fully processed so direct
IO has to wait for that to complete first.

>
> Am I missing something here?
>
> -ritesh
>
> On 11/26/19 1:27 PM, Damien Le Moal wrote:
>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
>> flag for a kiocb structure. However, the file system direct IO handler
>> function f2fs_direct_IO() may have decided that a direct IO has to be
>> exececuted as a buffered IO using the function f2fs_force_buffered_io().
>> This is the case for instance for volumes including zoned block device
>> and for unaligned write IOs with LFS mode enabled.
>>
>> These 2 different methods of identifying direct IOs can result in
>> inconsistencies generating stale data access for direct reads after a
>> direct IO write that is treated as a buffered write. Fix this
>> inconsistency by combining the IOCB_DIRECT flag test with the result
>> of f2fs_force_buffered_io().
>>
>> Reported-by: Javier Gonzalez <[email protected]>
>> Signed-off-by: Damien Le Moal <[email protected]>
>> ---
>> fs/f2fs/data.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 5755e897a5f0..8ac2d3b70022 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> int flag;
>> int err = 0;
>> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>> + bool do_direct_io = direct_io &&
>> + !f2fs_force_buffered_io(inode, iocb, from);
>>
>> /* convert inline data for Direct I/O*/
>> if (direct_io) {
>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> return err;
>> }
>>
>> - if (direct_io && allow_outplace_dio(inode, iocb, from))
>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>> return 0;
>>
>> if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>
>
>


--
Damien Le Moal
Western Digital Research

2019-11-28 03:05:54

by Shinichiro Kawasaki

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On Nov 26, 2019 / 15:44, Jaegeuk Kim wrote:
> On 11/26, Damien Le Moal wrote:
> > f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
> > flag for a kiocb structure. However, the file system direct IO handler
> > function f2fs_direct_IO() may have decided that a direct IO has to be
> > exececuted as a buffered IO using the function f2fs_force_buffered_io().
> > This is the case for instance for volumes including zoned block device
> > and for unaligned write IOs with LFS mode enabled.
> >
> > These 2 different methods of identifying direct IOs can result in
> > inconsistencies generating stale data access for direct reads after a
> > direct IO write that is treated as a buffered write. Fix this
> > inconsistency by combining the IOCB_DIRECT flag test with the result
> > of f2fs_force_buffered_io().
> >
> > Reported-by: Javier Gonzalez <[email protected]>
> > Signed-off-by: Damien Le Moal <[email protected]>
> > ---
> > fs/f2fs/data.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 5755e897a5f0..8ac2d3b70022 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> > int flag;
> > int err = 0;
> > bool direct_io = iocb->ki_flags & IOCB_DIRECT;
> > + bool do_direct_io = direct_io &&
> > + !f2fs_force_buffered_io(inode, iocb, from);
> >
> > /* convert inline data for Direct I/O*/
> > if (direct_io) {
> > @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> > return err;
> > }
> >
> > - if (direct_io && allow_outplace_dio(inode, iocb, from))
> > + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>
> It seems f2fs_force_buffered_io() includes allow_outplace_dio().
>
> How about this?

Thanks. I confirmed that the issue is gone with your patch.

Tested-by: Shin'ichiro Kawasaki <[email protected]>

> ---
> fs/f2fs/data.c | 13 -------------
> fs/f2fs/file.c | 35 +++++++++++++++++++++++++----------
> 2 files changed, 25 insertions(+), 23 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index a034cd0ce021..fc40a72f7827 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1180,19 +1180,6 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>
> - /* convert inline data for Direct I/O*/
> - if (direct_io) {
> - err = f2fs_convert_inline_inode(inode);
> - if (err)
> - return err;
> - }
> -
> - if (direct_io && allow_outplace_dio(inode, iocb, from))
> - return 0;
> -
> - if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> - return 0;
> -
> map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
> map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
> if (map.m_len > map.m_lblk)
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index c0560d62dbee..6b32ac6c3382 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -3386,18 +3386,33 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> ret = -EAGAIN;
> goto out;
> }
> - } else {
> - preallocated = true;
> - target_size = iocb->ki_pos + iov_iter_count(from);
> + goto write;
> + }
>
> - err = f2fs_preallocate_blocks(iocb, from);
> - if (err) {
> - clear_inode_flag(inode, FI_NO_PREALLOC);
> - inode_unlock(inode);
> - ret = err;
> - goto out;
> - }
> + if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> + goto write;
> +
> + if (iocb->ki_flags & IOCB_DIRECT) {
> + /* convert inline data for Direct I/O*/
> + err = f2fs_convert_inline_inode(inode);
> + if (err)
> + goto out_err;
> +
> + if (!f2fs_force_buffered_io(inode, iocb, from))
> + goto write;
> + }
> + preallocated = true;
> + target_size = iocb->ki_pos + iov_iter_count(from);
> +
> + err = f2fs_preallocate_blocks(iocb, from);
> + if (err) {
> +out_err:
> + clear_inode_flag(inode, FI_NO_PREALLOC);
> + inode_unlock(inode);
> + ret = err;
> + goto out;
> }
> +write:
> ret = __generic_file_write_iter(iocb, from);
> clear_inode_flag(inode, FI_NO_PREALLOC);
>
> --
> 2.19.0.605.g01d371f741-goog
>

--
Best Regards,
Shin'ichiro Kawasaki

2019-11-28 10:23:33

by Ritesh Harjani

Subject: Re: [PATCH] f2fs: Fix direct IO handling



On 11/28/19 7:40 AM, Damien Le Moal wrote:
> On 2019/11/26 17:34, Ritesh Harjani wrote:
>> Hello Damien,
>>
>> IIUC, you are trying to fix a stale data read by DIO read for the case
>> you explained in your patch w.r.t. DIO-write forced to write as buffIO.
>>
>> Coincidentally I was just looking at the same code path just now.
>> So I do have a query to you/f2fs group. Below could be silly one, as I
>> don't understand F2FS in great detail.
>>
>> How is the stale data by DIO read, is protected against a mmap
>> writes via f2fs_vm_page_mkwrite?
>>
>> f2fs_vm_page_mkwrite()          f2fs_direct_IO (read)
>>                                 filemap_write_and_wait_range()
>> -> f2fs_get_blocks()
>>                                 -> submit_bio()
>>
>> -> set_page_dirty()
>>
>> Is above race possible with current f2fs code?
>> i.e. f2fs_direct_IO could read the stale data from the blocks
>> which were allocated due to mmap fault?
>
> The faulted page is locked until the fault is fully processed so direct
> IO has to wait for that to complete first.

How about the interleaving below?

f2fs_vm_page_mkwrite()          f2fs_direct_IO (read)
                                filemap_write_and_wait_range()
-> down_read(->i_mmap_sem);
-> lock_page()
-> f2fs_get_blocks()
                                -> submit_bio()

-> set_page_dirty()

Can the DIO read above not expose stale data from the block that was
allocated in the f2fs_vm_page_mkwrite() path?


>
>>
>> Am I missing something here?
>>
>> -ritesh
>>
>> On 11/26/19 1:27 PM, Damien Le Moal wrote:
>>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
>>> flag for a kiocb structure. However, the file system direct IO handler
>>> function f2fs_direct_IO() may have decided that a direct IO has to be
>>> exececuted as a buffered IO using the function f2fs_force_buffered_io().
>>> This is the case for instance for volumes including zoned block device
>>> and for unaligned write IOs with LFS mode enabled.
>>>
>>> These 2 different methods of identifying direct IOs can result in
>>> inconsistencies generating stale data access for direct reads after a
>>> direct IO write that is treated as a buffered write. Fix this
>>> inconsistency by combining the IOCB_DIRECT flag test with the result
>>> of f2fs_force_buffered_io().
>>>
>>> Reported-by: Javier Gonzalez <[email protected]>
>>> Signed-off-by: Damien Le Moal <[email protected]>
>>> ---
>>> fs/f2fs/data.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 5755e897a5f0..8ac2d3b70022 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> int flag;
>>> int err = 0;
>>> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>>> + bool do_direct_io = direct_io &&
>>> + !f2fs_force_buffered_io(inode, iocb, from);
>>>
>>> /* convert inline data for Direct I/O*/
>>> if (direct_io) {
>>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> return err;
>>> }
>>>
>>> - if (direct_io && allow_outplace_dio(inode, iocb, from))
>>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>>> return 0;
>>>
>>> if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>>
>>
>>
>
>

2019-11-29 03:39:21

by Javier González

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On 26.11.2019 15:44, Jaegeuk Kim wrote:
>On 11/26, Damien Le Moal wrote:
>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
>> flag for a kiocb structure. However, the file system direct IO handler
>> function f2fs_direct_IO() may have decided that a direct IO has to be
>> exececuted as a buffered IO using the function f2fs_force_buffered_io().
>> This is the case for instance for volumes including zoned block device
>> and for unaligned write IOs with LFS mode enabled.
>>
>> These 2 different methods of identifying direct IOs can result in
>> inconsistencies generating stale data access for direct reads after a
>> direct IO write that is treated as a buffered write. Fix this
>> inconsistency by combining the IOCB_DIRECT flag test with the result
>> of f2fs_force_buffered_io().
>>
>> Reported-by: Javier Gonzalez <[email protected]>
>> Signed-off-by: Damien Le Moal <[email protected]>
>> ---
>> fs/f2fs/data.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 5755e897a5f0..8ac2d3b70022 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> int flag;
>> int err = 0;
>> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>> + bool do_direct_io = direct_io &&
>> + !f2fs_force_buffered_io(inode, iocb, from);
>>
>> /* convert inline data for Direct I/O*/
>> if (direct_io) {
>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> return err;
>> }
>>
>> - if (direct_io && allow_outplace_dio(inode, iocb, from))
>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>
>It seems f2fs_force_buffered_io() includes allow_outplace_dio().
>
>How about this?
>---
> fs/f2fs/data.c | 13 -------------
> fs/f2fs/file.c | 35 +++++++++++++++++++++++++----------
> 2 files changed, 25 insertions(+), 23 deletions(-)
>
>diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>index a034cd0ce021..fc40a72f7827 100644
>--- a/fs/f2fs/data.c
>+++ b/fs/f2fs/data.c
>@@ -1180,19 +1180,6 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>
>- /* convert inline data for Direct I/O*/
>- if (direct_io) {
>- err = f2fs_convert_inline_inode(inode);
>- if (err)
>- return err;
>- }
>-
>- if (direct_io && allow_outplace_dio(inode, iocb, from))
>- return 0;
>-
>- if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>- return 0;
>-
> map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
> map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
> if (map.m_len > map.m_lblk)
>diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>index c0560d62dbee..6b32ac6c3382 100644
>--- a/fs/f2fs/file.c
>+++ b/fs/f2fs/file.c
>@@ -3386,18 +3386,33 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> ret = -EAGAIN;
> goto out;
> }
>- } else {
>- preallocated = true;
>- target_size = iocb->ki_pos + iov_iter_count(from);
>+ goto write;
>+ }
>
>- err = f2fs_preallocate_blocks(iocb, from);
>- if (err) {
>- clear_inode_flag(inode, FI_NO_PREALLOC);
>- inode_unlock(inode);
>- ret = err;
>- goto out;
>- }
>+ if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>+ goto write;
>+
>+ if (iocb->ki_flags & IOCB_DIRECT) {
>+ /* convert inline data for Direct I/O*/
>+ err = f2fs_convert_inline_inode(inode);
>+ if (err)
>+ goto out_err;
>+
>+ if (!f2fs_force_buffered_io(inode, iocb, from))
>+ goto write;
>+ }
>+ preallocated = true;
>+ target_size = iocb->ki_pos + iov_iter_count(from);
>+
>+ err = f2fs_preallocate_blocks(iocb, from);
>+ if (err) {
>+out_err:
>+ clear_inode_flag(inode, FI_NO_PREALLOC);
>+ inode_unlock(inode);
>+ ret = err;
>+ goto out;
> }
>+write:
> ret = __generic_file_write_iter(iocb, from);
> clear_inode_flag(inode, FI_NO_PREALLOC);
>
>--
>2.19.0.605.g01d371f741-goog
>
This also addresses the original problem.

Tested-by: Javier González <[email protected]>

2019-11-30 06:47:17

by Chao Yu

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On 2019/11/27 7:44, Jaegeuk Kim wrote:
> On 11/26, Damien Le Moal wrote:
>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
>> flag for a kiocb structure. However, the file system direct IO handler
>> function f2fs_direct_IO() may have decided that a direct IO has to be
>> exececuted as a buffered IO using the function f2fs_force_buffered_io().
>> This is the case for instance for volumes including zoned block device
>> and for unaligned write IOs with LFS mode enabled.
>>
>> These 2 different methods of identifying direct IOs can result in
>> inconsistencies generating stale data access for direct reads after a
>> direct IO write that is treated as a buffered write. Fix this
>> inconsistency by combining the IOCB_DIRECT flag test with the result
>> of f2fs_force_buffered_io().
>>
>> Reported-by: Javier Gonzalez <[email protected]>
>> Signed-off-by: Damien Le Moal <[email protected]>
>> ---
>> fs/f2fs/data.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>> index 5755e897a5f0..8ac2d3b70022 100644
>> --- a/fs/f2fs/data.c
>> +++ b/fs/f2fs/data.c
>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> int flag;
>> int err = 0;
>> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>> + bool do_direct_io = direct_io &&
>> + !f2fs_force_buffered_io(inode, iocb, from);
>>
>> /* convert inline data for Direct I/O*/
>> if (direct_io) {
>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>> return err;
>> }
>>
>> - if (direct_io && allow_outplace_dio(inode, iocb, from))
>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>
> It seems f2fs_force_buffered_io() includes allow_outplace_dio().
>
> How about this?
> ---
> fs/f2fs/data.c | 13 -------------
> fs/f2fs/file.c | 35 +++++++++++++++++++++++++----------
> 2 files changed, 25 insertions(+), 23 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index a034cd0ce021..fc40a72f7827 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1180,19 +1180,6 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
> int err = 0;
> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>
> - /* convert inline data for Direct I/O*/
> - if (direct_io) {
> - err = f2fs_convert_inline_inode(inode);
> - if (err)
> - return err;
> - }
> -
> - if (direct_io && allow_outplace_dio(inode, iocb, from))
> - return 0;
> -
> - if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> - return 0;
> -
> map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
> map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
> if (map.m_len > map.m_lblk)
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index c0560d62dbee..6b32ac6c3382 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -3386,18 +3386,33 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> ret = -EAGAIN;
> goto out;
> }
> - } else {
> - preallocated = true;
> - target_size = iocb->ki_pos + iov_iter_count(from);
> + goto write;
> + }
>
> - err = f2fs_preallocate_blocks(iocb, from);
> - if (err) {
> - clear_inode_flag(inode, FI_NO_PREALLOC);
> - inode_unlock(inode);
> - ret = err;
> - goto out;
> - }
> + if (is_inode_flag_set(inode, FI_NO_PREALLOC))
> + goto write;
> +
> + if (iocb->ki_flags & IOCB_DIRECT) {
> + /* convert inline data for Direct I/O*/

Minor thing: add a space before the end of the comment:

I/O */

> + err = f2fs_convert_inline_inode(inode);
> + if (err)
> + goto out_err;
> +
> + if (!f2fs_force_buffered_io(inode, iocb, from))
> + goto write;

We can call f2fs_convert_inline_inode() here to avoid unneeded inline
conversion.

Thanks,

> + }
> + preallocated = true;
> + target_size = iocb->ki_pos + iov_iter_count(from);
> +
> + err = f2fs_preallocate_blocks(iocb, from);
> + if (err) {
> +out_err:
> + clear_inode_flag(inode, FI_NO_PREALLOC);
> + inode_unlock(inode);
> + ret = err;
> + goto out;
> }
> +write:
> ret = __generic_file_write_iter(iocb, from);
> clear_inode_flag(inode, FI_NO_PREALLOC);
>
>

2019-11-30 07:23:54

by Chao Yu

Subject: Re: [PATCH] f2fs: Fix direct IO handling

-Cc fsdevel mailing list

On 2019/11/28 10:10, Damien Le Moal wrote:
> On 2019/11/26 17:34, Ritesh Harjani wrote:
>> Hello Damien,
>>
>> IIUC, you are trying to fix a stale data read by DIO read for the case
>> you explained in your patch w.r.t. DIO-write forced to write as buffIO.
>>
>> Coincidentally I was just looking at the same code path just now.
>> So I do have a query to you/f2fs group. Below could be silly one, as I
>> don't understand F2FS in great detail.
>>
>> How is the stale data by DIO read, is protected against a mmap
>> writes via f2fs_vm_page_mkwrite?
>>
>> f2fs_vm_page_mkwrite()          f2fs_direct_IO (read)
>>                                 filemap_write_and_wait_range()
                                    - writepages
                                      lock_page
                                      - writepage
                                      unlock_page
   lock_page
>> -> f2fs_get_blocks()

                                   - f2fs_map_blocks

>>                                 -> submit_bio()
>>
>> -> set_page_dirty()

   unlock_page

I guess the lock range is as above, so the race can happen. However:
1) If mkwrite() creates data in a hole, direct_IO->f2fs_map_blocks should
return NEW_ADDR, which means that it is a hole in the file, so direct_IO
should read all-zero data.
2) If mkwrite() overwrites data in a block, mkwrite->f2fs_get_blocks won't
change the old block address, so direct_IO->f2fs_map_blocks gets that
block address and won't read stale data.

But I wonder whether we could read stale data with the race condition below:

kworker                            DIO reader
- writepages
                                   - f2fs_map_blocks
                                     - get old block address
 - writepage
   trigger OPU, update old block address to new one

someone trigger checkpoint, data in old block address becomes stale,
then anyone else can write data into there.
                                   - submit_bio
                                     get stale data

Or am I missing something, e.g. that the VFS already provides such synchronization?
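
The interleaving suspected above can be replayed sequentially in a toy
user-space model. This is a sketch under strong assumptions: one byte stands
for one block, the file has a single logical block, and every name
(model_disk, model_file, the model_* helpers) is hypothetical. It only shows
what the reader would see if nothing prevented this ordering. It does not
show whether f2fs or the VFS actually allows it.

```c
#include <assert.h>

#define MODEL_NBLOCKS 4

/* One byte of payload per physical block. */
struct model_disk {
	char data[MODEL_NBLOCKS];
};

/* A file with a single logical block mapped to one physical address. */
struct model_file {
	int blkaddr;
};

/* DIO reader, step 1: look up the current physical address. */
static int model_reader_map_block(const struct model_file *f)
{
	return f->blkaddr;
}

/* Writeback with an out-of-place update: new block, then remap. */
static void model_writeback_opu(struct model_disk *d, struct model_file *f,
				int new_addr, char payload)
{
	assert(new_addr >= 0 && new_addr < MODEL_NBLOCKS);
	d->data[new_addr] = payload;
	f->blkaddr = new_addr;
}

/* After a checkpoint the old block is free and may be rewritten. */
static void model_reuse_block(struct model_disk *d, int addr, char unrelated)
{
	assert(addr >= 0 && addr < MODEL_NBLOCKS);
	d->data[addr] = unrelated;
}

/* DIO reader, step 2: read from the address sampled earlier. */
static char model_reader_submit(const struct model_disk *d, int addr)
{
	assert(addr >= 0 && addr < MODEL_NBLOCKS);
	return d->data[addr];
}

/* Replays the suspected ordering and returns what the reader sees. */
static char model_run_suspected_interleaving(void)
{
	struct model_disk d = { .data = { 'A', 0, 0, 0 } };
	struct model_file f = { .blkaddr = 0 };
	int sampled;

	sampled = model_reader_map_block(&f);	/* reader gets old address */
	model_writeback_opu(&d, &f, 1, 'B');	/* kworker: OPU to block 1 */
	model_reuse_block(&d, sampled, 'X');	/* old block reused */
	return model_reader_submit(&d, sampled);/* reader reads old address */
}
```

In this model the reader gets the unrelated payload written by
model_reuse_block(), neither the file's old contents nor its new ones, which
is the stale read in question.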

Thanks,

>>
>> Is above race possible with current f2fs code?
>> i.e. f2fs_direct_IO could read the stale data from the blocks
>> which were allocated due to mmap fault?
>
> The faulted page is locked until the fault is fully processed so direct
> IO has to wait for that to complete first.
>
>>
>> Am I missing something here?
>>
>> -ritesh
>>
>> On 11/26/19 1:27 PM, Damien Le Moal wrote:
>>> f2fs_preallocate_blocks() identifies direct IOs using the IOCB_DIRECT
>>> flag for a kiocb structure. However, the file system direct IO handler
>>> function f2fs_direct_IO() may have decided that a direct IO has to be
>>> exececuted as a buffered IO using the function f2fs_force_buffered_io().
>>> This is the case for instance for volumes including zoned block device
>>> and for unaligned write IOs with LFS mode enabled.
>>>
>>> These 2 different methods of identifying direct IOs can result in
>>> inconsistencies generating stale data access for direct reads after a
>>> direct IO write that is treated as a buffered write. Fix this
>>> inconsistency by combining the IOCB_DIRECT flag test with the result
>>> of f2fs_force_buffered_io().
>>>
>>> Reported-by: Javier Gonzalez <[email protected]>
>>> Signed-off-by: Damien Le Moal <[email protected]>
>>> ---
>>> fs/f2fs/data.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 5755e897a5f0..8ac2d3b70022 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -1073,6 +1073,8 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> int flag;
>>> int err = 0;
>>> bool direct_io = iocb->ki_flags & IOCB_DIRECT;
>>> + bool do_direct_io = direct_io &&
>>> + !f2fs_force_buffered_io(inode, iocb, from);
>>>
>>> /* convert inline data for Direct I/O*/
>>> if (direct_io) {
>>> @@ -1081,7 +1083,7 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
>>> return err;
>>> }
>>>
>>> - if (direct_io && allow_outplace_dio(inode, iocb, from))
>>> + if (do_direct_io && allow_outplace_dio(inode, iocb, from))
>>> return 0;
>>>
>>> if (is_inode_flag_set(inode, FI_NO_PREALLOC))
>>>
>>
>>
>
>

2019-11-30 07:29:04

by Chao Yu

Subject: Re: [PATCH] f2fs: Fix direct IO handling

On 2019/11/28 18:20, Ritesh Harjani wrote:
>
>
> On 11/28/19 7:40 AM, Damien Le Moal wrote:
>> On 2019/11/26 17:34, Ritesh Harjani wrote:
>>> Hello Damien,
>>>
>>> IIUC, you are trying to fix a stale data read by DIO read for the case
>>> you explained in your patch w.r.t. DIO-write forced to write as buffIO.
>>>
>>> Coincidentally I was just looking at the same code path just now.
>>> So I do have a query for you/the f2fs group. The one below could be a silly
>>> one, as I don't understand F2FS in great detail.
>>>
>>> How is a DIO read protected from returning stale data when it races with
>>> mmap writes via f2fs_vm_page_mkwrite?
>>>
>>> f2fs_vm_page_mkwrite()            f2fs_direct_IO (read)
>>>                                   filemap_write_and_wait_range()
>>> -> f2fs_get_blocks()
>>>                                   -> submit_bio()
>>>
>>> -> set_page_dirty()
>>>
>>> Is the above race possible with the current f2fs code?
>>> I.e., could f2fs_direct_IO read stale data from the blocks
>>> which were allocated due to the mmap fault?
>>
>> The faulted page is locked until the fault is fully processed so direct
>> IO has to wait for that to complete first.
>
> How about the interleaving below?
>
> f2fs_vm_page_mkwrite()            f2fs_direct_IO (read)
>                                   filemap_write_and_wait_range()
> -> down_read(->i_mmap_sem);
> -> lock_page()
> -> f2fs_get_blocks()
>                                   -> submit_bio()
>
> -> set_page_dirty()
>
> Can the above DIO read not expose stale data from the block which was
> allocated in the f2fs_vm_page_mkwrite path?

The race can happen. However, I suspect the race condition is more complicated,
as I described in my previous reply. Could you check that?

Thanks,

>
>
>>
>>>
>>> Am I missing something here?
>>>
>>> -ritesh
>>>

2019-12-03 17:34:24

by Jaegeuk Kim

Subject: Re: [PATCH v2] f2fs: Fix direct IO handling

Thank you for checking the patch.
I found some regressions in xfstests, so I want to follow Damien's patch,
as below.

Thanks,

===
From 9df6f09e3a09ed804aba4b56ff7cd9524c002e69 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <[email protected]>
Date: Tue, 26 Nov 2019 15:01:42 -0800
Subject: [PATCH] f2fs: preallocate DIO blocks when forcing buffered_io

The previous preallocation and DIO decisions were as below.

                         allow_outplace_dio              !allow_outplace_dio
f2fs_force_buffered_io   (*) No_Prealloc / Buffered_IO   Prealloc / Buffered_IO
!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO

However, Javier reported case (*), where a zoned device bypassed preallocation
but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
being read.

To fix the issue, we need to preallocate blocks whenever we fall back to
buffered IO, as shown below. No change is made in the other cases.

                         allow_outplace_dio              !allow_outplace_dio
f2fs_force_buffered_io   (*) Prealloc / Buffered_IO      Prealloc / Buffered_IO
!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO

Reported-and-tested-by: Javier Gonzalez <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Tested-by: Shin'ichiro Kawasaki <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/data.c | 13 -------------
fs/f2fs/file.c | 43 +++++++++++++++++++++++++++++++++----------
2 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a034cd0ce021..fc40a72f7827 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1180,19 +1180,6 @@ int f2fs_preallocate_blocks(struct kiocb *iocb, struct iov_iter *from)
 	int err = 0;
 	bool direct_io = iocb->ki_flags & IOCB_DIRECT;

-	/* convert inline data for Direct I/O*/
-	if (direct_io) {
-		err = f2fs_convert_inline_inode(inode);
-		if (err)
-			return err;
-	}
-
-	if (direct_io && allow_outplace_dio(inode, iocb, from))
-		return 0;
-
-	if (is_inode_flag_set(inode, FI_NO_PREALLOC))
-		return 0;
-
 	map.m_lblk = F2FS_BLK_ALIGN(iocb->ki_pos);
 	map.m_len = F2FS_BYTES_TO_BLK(iocb->ki_pos + iov_iter_count(from));
 	if (map.m_len > map.m_lblk)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index c0560d62dbee..0e1b12a4a4d6 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -3386,18 +3386,41 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 				ret = -EAGAIN;
 				goto out;
 			}
-		} else {
-			preallocated = true;
-			target_size = iocb->ki_pos + iov_iter_count(from);
+			goto write;
+		}

-			err = f2fs_preallocate_blocks(iocb, from);
-			if (err) {
-				clear_inode_flag(inode, FI_NO_PREALLOC);
-				inode_unlock(inode);
-				ret = err;
-				goto out;
-			}
+		if (is_inode_flag_set(inode, FI_NO_PREALLOC))
+			goto write;
+
+		if (iocb->ki_flags & IOCB_DIRECT) {
+			/*
+			 * Convert inline data for Direct I/O before entering
+			 * f2fs_direct_IO().
+			 */
+			err = f2fs_convert_inline_inode(inode);
+			if (err)
+				goto out_err;
+			/*
+			 * If f2fs_force_buffered_io() is true, we have to
+			 * allocate blocks all the time, since f2fs_direct_IO
+			 * will fall back to buffered IO.
+			 */
+			if (!f2fs_force_buffered_io(inode, iocb, from) &&
+					allow_outplace_dio(inode, iocb, from))
+				goto write;
+		}
+		preallocated = true;
+		target_size = iocb->ki_pos + iov_iter_count(from);
+
+		err = f2fs_preallocate_blocks(iocb, from);
+		if (err) {
+out_err:
+			clear_inode_flag(inode, FI_NO_PREALLOC);
+			inode_unlock(inode);
+			ret = err;
+			goto out;
 		}
+write:
 		ret = __generic_file_write_iter(iocb, from);
 		clear_inode_flag(inode, FI_NO_PREALLOC);

--
2.19.0.605.g01d371f741-goog
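
[Editorial sketch, not part of the patch] The two decision tables in the
commit message above can be modelled in user space to check that only case
(*) changes. This is a rough illustration under stated assumptions: the
struct and function names (write_ctx, prealloc_before, prealloc_after, ...)
are hypothetical, the booleans merely stand in for what the kernel helpers
f2fs_force_buffered_io() and allow_outplace_dio() would return, and the
IOCB_NOWAIT branch of f2fs_file_write_iter() is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * User-space model only. The fields mirror the checks named in the patch;
 * the struct and function names are hypothetical, not f2fs API.
 */
struct write_ctx {
	bool is_dio;         /* iocb->ki_flags & IOCB_DIRECT */
	bool no_prealloc;    /* FI_NO_PREALLOC is set on the inode */
	bool force_buffered; /* what f2fs_force_buffered_io() would return */
	bool allow_outplace; /* what allow_outplace_dio() would return */
};

/* True when the write ends up going through the page cache. */
static bool is_buffered(const struct write_ctx *c)
{
	return !c->is_dio || c->force_buffered;
}

/* Old behaviour: the outplace-DIO shortcut ignored the buffered fallback. */
static bool prealloc_before(const struct write_ctx *c)
{
	if (c->is_dio && c->allow_outplace)
		return false;
	if (c->no_prealloc)
		return false;
	return true;
}

/* New behaviour: skip preallocation only for a write that stays direct. */
static bool prealloc_after(const struct write_ctx *c)
{
	if (c->no_prealloc)
		return false;
	if (c->is_dio && !c->force_buffered && c->allow_outplace)
		return false;
	return true;
}

/* Walk all 16 input combinations; only case (*) may differ. */
static void check_only_star_changes(void)
{
	for (int i = 0; i < 16; i++) {
		struct write_ctx c = {
			.is_dio = i & 1,
			.no_prealloc = (i >> 1) & 1,
			.force_buffered = (i >> 2) & 1,
			.allow_outplace = (i >> 3) & 1,
		};
		bool star = c.is_dio && !c.no_prealloc &&
			    c.force_buffered && c.allow_outplace;

		assert((prealloc_before(&c) != prealloc_after(&c)) == star);
	}
}

/* Print the four DIO cells of both tables side by side. */
static void print_tables(void)
{
	for (int force = 1; force >= 0; force--) {
		for (int outplace = 1; outplace >= 0; outplace--) {
			struct write_ctx c = {
				.is_dio = true,
				.no_prealloc = false,
				.force_buffered = force,
				.allow_outplace = outplace,
			};

			printf("force=%d outplace=%d: before=%s after=%s (%s)\n",
			       force, outplace,
			       prealloc_before(&c) ? "Prealloc" : "No_Prealloc",
			       prealloc_after(&c) ? "Prealloc" : "No_Prealloc",
			       is_buffered(&c) ? "Buffered_IO" : "DIO");
		}
	}
}
```

Calling check_only_star_changes() and print_tables() from a small main() is
enough to exercise the model. Checking FI_NO_PREALLOC first, as the patch
does, leaves the result for the other fifteen input combinations unchanged
in this model.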


2019-12-04 01:28:46

by Chao Yu

Subject: Re: [PATCH v2] f2fs: Fix direct IO handling

On 2019/12/4 1:33, Jaegeuk Kim wrote:
> Thank you for checking the patch.
> I found some regressions in xfstests, so I want to follow Damien's patch,
> as below.
>
> Thanks,
>
> ===
> From 9df6f09e3a09ed804aba4b56ff7cd9524c002e69 Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <[email protected]>
> Date: Tue, 26 Nov 2019 15:01:42 -0800
> Subject: [PATCH] f2fs: preallocate DIO blocks when forcing buffered_io
>
> The previous preallocation and DIO decisions were as below.
>
>                          allow_outplace_dio              !allow_outplace_dio
> f2fs_force_buffered_io   (*) No_Prealloc / Buffered_IO   Prealloc / Buffered_IO
> !f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
> However, Javier reported case (*), where a zoned device bypassed preallocation
> but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
> being read.
>
> To fix the issue, we need to preallocate blocks whenever we fall back to
> buffered IO, as shown below. No change is made in the other cases.
>
>                          allow_outplace_dio              !allow_outplace_dio
> f2fs_force_buffered_io   (*) Prealloc / Buffered_IO      Prealloc / Buffered_IO
> !f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
> Reported-and-tested-by: Javier Gonzalez <[email protected]>
> Signed-off-by: Damien Le Moal <[email protected]>
> Tested-by: Shin'ichiro Kawasaki <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>

Reviewed-by: Chao Yu <[email protected]>

Thanks,

2019-12-04 04:02:57

by Shinichiro Kawasaki

Subject: Re: [PATCH v2] f2fs: Fix direct IO handling

On Dec 03, 2019 / 09:33, Jaegeuk Kim wrote:
> Thank you for checking the patch.
> I found some regressions in xfstests, so I want to follow Damien's patch,
> as below.
>
> Thanks,
>
> ===
> From 9df6f09e3a09ed804aba4b56ff7cd9524c002e69 Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <[email protected]>
> Date: Tue, 26 Nov 2019 15:01:42 -0800
> Subject: [PATCH] f2fs: preallocate DIO blocks when forcing buffered_io
>
> The previous preallocation and DIO decisions were as below.
>
>                          allow_outplace_dio              !allow_outplace_dio
> f2fs_force_buffered_io   (*) No_Prealloc / Buffered_IO   Prealloc / Buffered_IO
> !f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
> However, Javier reported case (*), where a zoned device bypassed preallocation
> but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
> being read.
>
> To fix the issue, we need to preallocate blocks whenever we fall back to
> buffered IO, as shown below. No change is made in the other cases.
>
>                          allow_outplace_dio              !allow_outplace_dio
> f2fs_force_buffered_io   (*) Prealloc / Buffered_IO      Prealloc / Buffered_IO
> !f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
> Reported-and-tested-by: Javier Gonzalez <[email protected]>
> Signed-off-by: Damien Le Moal <[email protected]>
> Tested-by: Shin'ichiro Kawasaki <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>

Using SMR disks, I reconfirmed that the reported problem also goes away with
this modified patch. Thanks.

--
Best Regards,
Shin'ichiro Kawasaki

2019-12-04 08:17:41

by Javier González

Subject: Re: [PATCH v2] f2fs: Fix direct IO handling

On 03.12.2019 09:33, Jaegeuk Kim wrote:
>Thank you for checking the patch.
>I found some regressions in xfstests, so I want to follow Damien's patch,
>as below.
>
>Thanks,
>
>===
>From 9df6f09e3a09ed804aba4b56ff7cd9524c002e69 Mon Sep 17 00:00:00 2001
>From: Jaegeuk Kim <[email protected]>
>Date: Tue, 26 Nov 2019 15:01:42 -0800
>Subject: [PATCH] f2fs: preallocate DIO blocks when forcing buffered_io
>
>The previous preallocation and DIO decisions were as below.
>
>                         allow_outplace_dio              !allow_outplace_dio
>f2fs_force_buffered_io   (*) No_Prealloc / Buffered_IO   Prealloc / Buffered_IO
>!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
>However, Javier reported case (*), where a zoned device bypassed preallocation
>but fell back to buffered writes in f2fs_direct_IO(), resulting in stale data
>being read.
>
>To fix the issue, we need to preallocate blocks whenever we fall back to
>buffered IO, as shown below. No change is made in the other cases.
>
>                         allow_outplace_dio              !allow_outplace_dio
>f2fs_force_buffered_io   (*) Prealloc / Buffered_IO      Prealloc / Buffered_IO
>!f2fs_force_buffered_io  No_Prealloc / DIO               Prealloc / DIO
>
>Reported-and-tested-by: Javier Gonzalez <[email protected]>
>Signed-off-by: Damien Le Moal <[email protected]>
>Tested-by: Shin'ichiro Kawasaki <[email protected]>
>Signed-off-by: Jaegeuk Kim <[email protected]>

Looks good to me. It also fixes the problem we see on our end.

Reviewed-by: Javier González <[email protected]>