2024-03-06 10:03:04

by Miklos Szeredi

Subject: Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io

On Tue, 29 Aug 2023 at 20:37, Lei Huang <[email protected]> wrote:
>
> Our user space filesystem relies on fuse to provide a POSIX interface.
> In our test, a known string is written into a file and the content
> is read back later to verify that the correct data is returned. In rare
> cases we observed wrong data in the read buffer, although the correct
> data is stored in our filesystem.
>
> The fuse kernel module calls iov_iter_get_pages2() to get the physical
> pages of the user-space read buffer passed in read(). The pages are
> not pinned, so page migration is not prevented. When page migration
> occurs, the consequences are twofold:
>
> 1) Applications do not receive the correct data in the read buffer.
> 2) The fuse kernel module writes data to the wrong place.
>
> Using iov_iter_extract_pages() to pin pages fixes the issue in our
> test.
>
> An auxiliary variable "struct page **pt_pages" is used in the patch
> to prepare the 2nd parameter for iov_iter_extract_pages() since
> iov_iter_get_pages2() uses a different type for the 2nd parameter.
>
> Signed-off-by: Lei Huang <[email protected]>

Applied, with a modification to only unpin if
iov_iter_extract_will_pin() returns true.

Thanks,
Miklos
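
Note: the "only unpin if iov_iter_extract_will_pin() returns true" rule
above can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not the applied patch: every model_*
name below is hypothetical and none of it is kernel code. It assumes
that extraction pins pages only for user-backed iterators, and it shows
why the caller passes the address of a page-array pointer (the role
"pt_pages" plays in the patch description).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT kernel types or kernel functions. */
struct model_page {
	int pin_count;			/* models pins held on the page */
};

enum model_iter_kind { MODEL_ITER_USER, MODEL_ITER_KERNEL };

struct model_iter {
	enum model_iter_kind kind;
	struct model_page *backing;	/* pages the iterator covers */
	size_t nr_pages;
	size_t pos;			/* next page to hand out */
};

/* Assumption: only user-backed iterators have their pages pinned. */
static bool model_will_pin(const struct model_iter *it)
{
	return it->kind == MODEL_ITER_USER;
}

/*
 * Fill the caller's array (reached through *pages) with up to maxpages
 * pages, pinning each one only if this kind of iterator pins.
 * Returns the number of pages stored.
 */
static size_t model_extract_pages(struct model_iter *it,
				  struct model_page ***pages, size_t maxpages)
{
	size_t n = 0;

	while (n < maxpages && it->pos < it->nr_pages) {
		struct model_page *p = &it->backing[it->pos++];

		if (model_will_pin(it))
			p->pin_count++;
		(*pages)[n++] = p;
	}
	return n;
}

/* Release side: drop a pin only if extraction actually took one. */
static void model_release_pages(const struct model_iter *it,
				struct model_page **pages, size_t n)
{
	if (!model_will_pin(it))
		return;
	for (size_t i = 0; i < n; i++) {
		assert(pages[i]->pin_count > 0);
		pages[i]->pin_count--;
	}
}
```

A caller would keep its own array, take a pointer to it (for example
"struct model_page **pt_pages = slots;"), pass &pt_pages to
model_extract_pages(), and later hand the same array to
model_release_pages(). Unpinning unconditionally in the kernel-backed
case would drop pins that were never taken, which is what the
conditional check guards against.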


2024-03-06 11:16:17

by Bernd Schubert

Subject: Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io



On 3/6/24 11:01, Miklos Szeredi wrote:
> On Tue, 29 Aug 2023 at 20:37, Lei Huang <[email protected]> wrote:
>>
>> Our user space filesystem relies on fuse to provide a POSIX interface.
>> In our test, a known string is written into a file and the content
>> is read back later to verify that the correct data is returned. In rare
>> cases we observed wrong data in the read buffer, although the correct
>> data is stored in our filesystem.
>>
>> The fuse kernel module calls iov_iter_get_pages2() to get the physical
>> pages of the user-space read buffer passed in read(). The pages are
>> not pinned, so page migration is not prevented. When page migration
>> occurs, the consequences are twofold:
>>
>> 1) Applications do not receive the correct data in the read buffer.
>> 2) The fuse kernel module writes data to the wrong place.
>>
>> Using iov_iter_extract_pages() to pin pages fixes the issue in our
>> test.
>>
>> An auxiliary variable "struct page **pt_pages" is used in the patch
>> to prepare the 2nd parameter for iov_iter_extract_pages() since
>> iov_iter_get_pages2() uses a different type for the 2nd parameter.
>>
>> Signed-off-by: Lei Huang <[email protected]>
>
> Applied, with a modification to only unpin if
> iov_iter_extract_will_pin() returns true.

Hi Miklos,

do you have an idea whether this needs to be backported, and if so to
which kernel versions?
I had tried to reproduce the data corruption with 4.18 (Lei wrote that he
could see issues with older kernels as well), but I never managed to
trigger anything on 4.18-RHEL. Typically I use ql-fstest
(https://github.com/bsbernd/ql-fstest) and even added random DIO as an
option, but nothing was reported in weeks of run time. I could try again
with more recent kernels that have folios.

Thanks,
Bernd

2024-03-06 12:06:26

by Miklos Szeredi

Subject: Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io

On Wed, 6 Mar 2024 at 12:16, Bernd Schubert <[email protected]> wrote:
>
>
>
> On 3/6/24 11:01, Miklos Szeredi wrote:
> > On Tue, 29 Aug 2023 at 20:37, Lei Huang <[email protected]> wrote:
> >>
> >> Our user space filesystem relies on fuse to provide a POSIX interface.
> >> In our test, a known string is written into a file and the content
> >> is read back later to verify that the correct data is returned. In rare
> >> cases we observed wrong data in the read buffer, although the correct
> >> data is stored in our filesystem.
> >>
> >> The fuse kernel module calls iov_iter_get_pages2() to get the physical
> >> pages of the user-space read buffer passed in read(). The pages are
> >> not pinned, so page migration is not prevented. When page migration
> >> occurs, the consequences are twofold:
> >>
> >> 1) Applications do not receive the correct data in the read buffer.
> >> 2) The fuse kernel module writes data to the wrong place.
> >>
> >> Using iov_iter_extract_pages() to pin pages fixes the issue in our
> >> test.
> >>
> >> An auxiliary variable "struct page **pt_pages" is used in the patch
> >> to prepare the 2nd parameter for iov_iter_extract_pages() since
> >> iov_iter_get_pages2() uses a different type for the 2nd parameter.
> >>
> >> Signed-off-by: Lei Huang <[email protected]>
> >
> > Applied, with a modification to only unpin if
> > iov_iter_extract_will_pin() returns true.
>
> Hi Miklos,
>
> do you have an idea whether this needs to be backported, and if so to
> which kernel versions?
> I had tried to reproduce the data corruption with 4.18 (Lei wrote that he
> could see issues with older kernels as well), but I never managed to
> trigger anything on 4.18-RHEL. Typically I use ql-fstest
> (https://github.com/bsbernd/ql-fstest) and even added random DIO as an
> option, but nothing was reported in weeks of run time. I could try again
> with more recent kernels that have folios.

I don't think that corruption will happen in real life, so I'm not
sure we need to bother with backporting, and definitely not to kernels
older than the one that introduced the infrastructure.

Thanks,
Miklos

2024-03-10 08:57:23

by Lei Huang

Subject: Re: [PATCH v1] fs/fuse: Fix missing FOLL_PIN for direct-io

Thank you very much, Miklos!

Yes, it is not easy to reproduce the issue in real applications. We
only observed it in our own testing tool, which runs multiple tests
concurrently. We have not been able to reproduce it with simple code yet.

-lei

On 3/6/24 07:05, Miklos Szeredi wrote:
> On Wed, 6 Mar 2024 at 12:16, Bernd Schubert <[email protected]> wrote:
>>
>>
>>
>> On 3/6/24 11:01, Miklos Szeredi wrote:
>>> On Tue, 29 Aug 2023 at 20:37, Lei Huang <[email protected]> wrote:
>>>>
>>>> Our user space filesystem relies on fuse to provide a POSIX interface.
>>>> In our test, a known string is written into a file and the content
>>>> is read back later to verify that the correct data is returned. In rare
>>>> cases we observed wrong data in the read buffer, although the correct
>>>> data is stored in our filesystem.
>>>>
>>>> The fuse kernel module calls iov_iter_get_pages2() to get the physical
>>>> pages of the user-space read buffer passed in read(). The pages are
>>>> not pinned, so page migration is not prevented. When page migration
>>>> occurs, the consequences are twofold:
>>>>
>>>> 1) Applications do not receive the correct data in the read buffer.
>>>> 2) The fuse kernel module writes data to the wrong place.
>>>>
>>>> Using iov_iter_extract_pages() to pin pages fixes the issue in our
>>>> test.
>>>>
>>>> An auxiliary variable "struct page **pt_pages" is used in the patch
>>>> to prepare the 2nd parameter for iov_iter_extract_pages() since
>>>> iov_iter_get_pages2() uses a different type for the 2nd parameter.
>>>>
>>>> Signed-off-by: Lei Huang <[email protected]>
>>>
>>> Applied, with a modification to only unpin if
>>> iov_iter_extract_will_pin() returns true.
>>
>> Hi Miklos,
>>
>> do you have an idea whether this needs to be backported, and if so to
>> which kernel versions?
>> I had tried to reproduce the data corruption with 4.18 (Lei wrote that he
>> could see issues with older kernels as well), but I never managed to
>> trigger anything on 4.18-RHEL. Typically I use ql-fstest
>> (https://github.com/bsbernd/ql-fstest) and even added random DIO as an
>> option, but nothing was reported in weeks of run time. I could try again
>> with more recent kernels that have folios.
>
> I don't think that corruption will happen in real life, so I'm not
> sure we need to bother with backporting, and definitely not to kernels
> older than the one that introduced the infrastructure.
>
> Thanks,
> Miklos