2024-02-28 14:56:23

by Hou Tao

[permalink] [raw]
Subject: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio

From: Hou Tao <[email protected]>

Hi,

The patch set aims to fix the warning related to an abnormal size
parameter of kmalloc() in virtiofs. The warning occurred when attempting
to insert a 10MB sized kernel module kept in a virtiofs with cache
disabled. As analyzed in patch #1, the root cause is that the length of
the read buffer is no limited, and the read buffer is passed directly to
virtiofs through out_args[0].value. Therefore patch #1 limits the
length of the read buffer passed to virtiofs by using max_pages. However
it is not enough, because now the maximal value of max_pages is 256.
Consequently, when reading a 10MB-sized kernel module, the length of the
bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
try to allocate 2MB from memory subsystem. The request for 2MB of
physically contiguous memory significantly stress the memory subsystem
and may fail indefinitely on hosts with fragmented memory. To address
this, patch #2~#5 use scattered pages in a bio_vec to replace the
kmalloc-allocated bounce buffer when the length of the bounce buffer for
KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
allocation of the bounce buffer and sg array in virtiofs is that
GFP_ATOMIC is used even when the allocation occurs in a kworker context.
Therefore the last patch uses GFP_NOFS for the allocation of both sg
array and bounce buffer when initiated by the kworker. For more details,
please check the individual patches.

As usual, comments are always welcome.

Change Log:

v2:
* limit the length of ITER_KVEC dio by max_pages instead of the
newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC
dio being consistent with other rw operations.
* replace kmalloc-allocated bounce buffer by using a bounce buffer
backed by scattered pages when the length of the bounce buffer for
KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with
fragmented memory, the KVEC_ITER dio can be handled normally by
virtiofs. (Bernd Schubert)
* merge the GFP_NOFS patch [1] into this patch-set and use
memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS
(Benjamin Coddington)

v1: https://lore.kernel.org/linux-fsdevel/[email protected]/

[1]: https://lore.kernel.org/linux-fsdevel/[email protected]/

Hou Tao (6):
fuse: limit the length of ITER_KVEC dio by max_pages
virtiofs: move alloc/free of argbuf into separated helpers
virtiofs: factor out more common methods for argbuf
virtiofs: support bounce buffer backed by scattered pages
virtiofs: use scattered bounce buffer for ITER_KVEC dio
virtiofs: use GFP_NOFS when enqueuing request through kworker

fs/fuse/file.c | 12 +-
fs/fuse/virtio_fs.c | 336 +++++++++++++++++++++++++++++++++++++-------
2 files changed, 296 insertions(+), 52 deletions(-)

--
2.29.2



2024-04-08 07:48:39

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio

On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote:
> From: Hou Tao <[email protected]>
>
> Hi,
>
> The patch set aims to fix the warning related to an abnormal size
> parameter of kmalloc() in virtiofs. The warning occurred when attempting
> to insert a 10MB sized kernel module kept in a virtiofs with cache
> disabled. As analyzed in patch #1, the root cause is that the length of
> the read buffer is no limited, and the read buffer is passed directly to
> virtiofs through out_args[0].value. Therefore patch #1 limits the
> length of the read buffer passed to virtiofs by using max_pages. However
> it is not enough, because now the maximal value of max_pages is 256.
> Consequently, when reading a 10MB-sized kernel module, the length of the
> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
> try to allocate 2MB from memory subsystem. The request for 2MB of
> physically contiguous memory significantly stress the memory subsystem
> and may fail indefinitely on hosts with fragmented memory. To address
> this, patch #2~#5 use scattered pages in a bio_vec to replace the
> kmalloc-allocated bounce buffer when the length of the bounce buffer for
> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
> allocation of the bounce buffer and sg array in virtiofs is that
> GFP_ATOMIC is used even when the allocation occurs in a kworker context.
> Therefore the last patch uses GFP_NOFS for the allocation of both sg
> array and bounce buffer when initiated by the kworker. For more details,
> please check the individual patches.
>
> As usual, comments are always welcome.
>
> Change Log:

Bernd should I just merge the patchset as is?
It seems to fix a real problem and no one has the
time to work on a better fix .... WDYT?


> v2:
> * limit the length of ITER_KVEC dio by max_pages instead of the
> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC
> dio being consistent with other rw operations.
> * replace kmalloc-allocated bounce buffer by using a bounce buffer
> backed by scattered pages when the length of the bounce buffer for
> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with
> fragmented memory, the KVEC_ITER dio can be handled normally by
> virtiofs. (Bernd Schubert)
> * merge the GFP_NOFS patch [1] into this patch-set and use
> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS
> (Benjamin Coddington)
>
> v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> [1]: https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> Hou Tao (6):
> fuse: limit the length of ITER_KVEC dio by max_pages
> virtiofs: move alloc/free of argbuf into separated helpers
> virtiofs: factor out more common methods for argbuf
> virtiofs: support bounce buffer backed by scattered pages
> virtiofs: use scattered bounce buffer for ITER_KVEC dio
> virtiofs: use GFP_NOFS when enqueuing request through kworker
>
> fs/fuse/file.c | 12 +-
> fs/fuse/virtio_fs.c | 336 +++++++++++++++++++++++++++++++++++++-------
> 2 files changed, 296 insertions(+), 52 deletions(-)
>
> --
> 2.29.2


2024-04-09 01:48:37

by Hou Tao

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio

Hi,

On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote:
> On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote:
>> From: Hou Tao <[email protected]>
>>
>> Hi,
>>
>> The patch set aims to fix the warning related to an abnormal size
>> parameter of kmalloc() in virtiofs. The warning occurred when attempting
>> to insert a 10MB sized kernel module kept in a virtiofs with cache
>> disabled. As analyzed in patch #1, the root cause is that the length of
>> the read buffer is no limited, and the read buffer is passed directly to
>> virtiofs through out_args[0].value. Therefore patch #1 limits the
>> length of the read buffer passed to virtiofs by using max_pages. However
>> it is not enough, because now the maximal value of max_pages is 256.
>> Consequently, when reading a 10MB-sized kernel module, the length of the
>> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
>> try to allocate 2MB from memory subsystem. The request for 2MB of
>> physically contiguous memory significantly stress the memory subsystem
>> and may fail indefinitely on hosts with fragmented memory. To address
>> this, patch #2~#5 use scattered pages in a bio_vec to replace the
>> kmalloc-allocated bounce buffer when the length of the bounce buffer for
>> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
>> allocation of the bounce buffer and sg array in virtiofs is that
>> GFP_ATOMIC is used even when the allocation occurs in a kworker context.
>> Therefore the last patch uses GFP_NOFS for the allocation of both sg
>> array and bounce buffer when initiated by the kworker. For more details,
>> please check the individual patches.
>>
>> As usual, comments are always welcome.
>>
>> Change Log:
> Bernd should I just merge the patchset as is?
> It seems to fix a real problem and no one has the
> time to work on a better fix .... WDYT?

Sorry for the long delay. I am just start to prepare for v3. In v3, I
plan to avoid the unnecessary memory copy between fuse args and bio_vec.
Will post it before next week.
>
>
>> v2:
>> * limit the length of ITER_KVEC dio by max_pages instead of the
>> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC
>> dio being consistent with other rw operations.
>> * replace kmalloc-allocated bounce buffer by using a bounce buffer
>> backed by scattered pages when the length of the bounce buffer for
>> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with
>> fragmented memory, the KVEC_ITER dio can be handled normally by
>> virtiofs. (Bernd Schubert)
>> * merge the GFP_NOFS patch [1] into this patch-set and use
>> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS
>> (Benjamin Coddington)
>>
>> v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>
>> [1]: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>
>> Hou Tao (6):
>> fuse: limit the length of ITER_KVEC dio by max_pages
>> virtiofs: move alloc/free of argbuf into separated helpers
>> virtiofs: factor out more common methods for argbuf
>> virtiofs: support bounce buffer backed by scattered pages
>> virtiofs: use scattered bounce buffer for ITER_KVEC dio
>> virtiofs: use GFP_NOFS when enqueuing request through kworker
>>
>> fs/fuse/file.c | 12 +-
>> fs/fuse/virtio_fs.c | 336 +++++++++++++++++++++++++++++++++++++-------
>> 2 files changed, 296 insertions(+), 52 deletions(-)
>>
>> --
>> 2.29.2


2024-04-22 20:06:55

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio

On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote:
> Hi,
>
> On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote:
> > On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote:
> >> From: Hou Tao <[email protected]>
> >>
> >> Hi,
> >>
> >> The patch set aims to fix the warning related to an abnormal size
> >> parameter of kmalloc() in virtiofs. The warning occurred when attempting
> >> to insert a 10MB sized kernel module kept in a virtiofs with cache
> >> disabled. As analyzed in patch #1, the root cause is that the length of
> >> the read buffer is no limited, and the read buffer is passed directly to
> >> virtiofs through out_args[0].value. Therefore patch #1 limits the
> >> length of the read buffer passed to virtiofs by using max_pages. However
> >> it is not enough, because now the maximal value of max_pages is 256.
> >> Consequently, when reading a 10MB-sized kernel module, the length of the
> >> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
> >> try to allocate 2MB from memory subsystem. The request for 2MB of
> >> physically contiguous memory significantly stress the memory subsystem
> >> and may fail indefinitely on hosts with fragmented memory. To address
> >> this, patch #2~#5 use scattered pages in a bio_vec to replace the
> >> kmalloc-allocated bounce buffer when the length of the bounce buffer for
> >> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
> >> allocation of the bounce buffer and sg array in virtiofs is that
> >> GFP_ATOMIC is used even when the allocation occurs in a kworker context.
> >> Therefore the last patch uses GFP_NOFS for the allocation of both sg
> >> array and bounce buffer when initiated by the kworker. For more details,
> >> please check the individual patches.
> >>
> >> As usual, comments are always welcome.
> >>
> >> Change Log:
> > Bernd should I just merge the patchset as is?
> > It seems to fix a real problem and no one has the
> > time to work on a better fix .... WDYT?
>
> Sorry for the long delay. I am just start to prepare for v3. In v3, I
> plan to avoid the unnecessary memory copy between fuse args and bio_vec.
> Will post it before next week.

Didn't happen before this week apparently.

> >
> >
> >> v2:
> >> * limit the length of ITER_KVEC dio by max_pages instead of the
> >> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC
> >> dio being consistent with other rw operations.
> >> * replace kmalloc-allocated bounce buffer by using a bounce buffer
> >> backed by scattered pages when the length of the bounce buffer for
> >> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with
> >> fragmented memory, the KVEC_ITER dio can be handled normally by
> >> virtiofs. (Bernd Schubert)
> >> * merge the GFP_NOFS patch [1] into this patch-set and use
> >> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS
> >> (Benjamin Coddington)
> >>
> >> v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >>
> >> [1]: https://lore.kernel.org/linux-fsdevel/[email protected]/
> >>
> >> Hou Tao (6):
> >> fuse: limit the length of ITER_KVEC dio by max_pages
> >> virtiofs: move alloc/free of argbuf into separated helpers
> >> virtiofs: factor out more common methods for argbuf
> >> virtiofs: support bounce buffer backed by scattered pages
> >> virtiofs: use scattered bounce buffer for ITER_KVEC dio
> >> virtiofs: use GFP_NOFS when enqueuing request through kworker
> >>
> >> fs/fuse/file.c | 12 +-
> >> fs/fuse/virtio_fs.c | 336 +++++++++++++++++++++++++++++++++++++-------
> >> 2 files changed, 296 insertions(+), 52 deletions(-)
> >>
> >> --
> >> 2.29.2


2024-04-22 21:12:27

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio



On 4/22/24 22:06, Michael S. Tsirkin wrote:
> On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote:
>>> On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote:
>>>> From: Hou Tao <[email protected]>
>>>>
>>>> Hi,
>>>>
>>>> The patch set aims to fix the warning related to an abnormal size
>>>> parameter of kmalloc() in virtiofs. The warning occurred when attempting
>>>> to insert a 10MB sized kernel module kept in a virtiofs with cache
>>>> disabled. As analyzed in patch #1, the root cause is that the length of
>>>> the read buffer is no limited, and the read buffer is passed directly to
>>>> virtiofs through out_args[0].value. Therefore patch #1 limits the
>>>> length of the read buffer passed to virtiofs by using max_pages. However
>>>> it is not enough, because now the maximal value of max_pages is 256.
>>>> Consequently, when reading a 10MB-sized kernel module, the length of the
>>>> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
>>>> try to allocate 2MB from memory subsystem. The request for 2MB of
>>>> physically contiguous memory significantly stress the memory subsystem
>>>> and may fail indefinitely on hosts with fragmented memory. To address
>>>> this, patch #2~#5 use scattered pages in a bio_vec to replace the
>>>> kmalloc-allocated bounce buffer when the length of the bounce buffer for
>>>> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
>>>> allocation of the bounce buffer and sg array in virtiofs is that
>>>> GFP_ATOMIC is used even when the allocation occurs in a kworker context.
>>>> Therefore the last patch uses GFP_NOFS for the allocation of both sg
>>>> array and bounce buffer when initiated by the kworker. For more details,
>>>> please check the individual patches.
>>>>
>>>> As usual, comments are always welcome.
>>>>
>>>> Change Log:
>>> Bernd should I just merge the patchset as is?
>>> It seems to fix a real problem and no one has the
>>> time to work on a better fix .... WDYT?
>>
>> Sorry for the long delay. I am just start to prepare for v3. In v3, I
>> plan to avoid the unnecessary memory copy between fuse args and bio_vec.
>> Will post it before next week.
>
> Didn't happen before this week apparently.

Hi Michael,

sorry for my later reply, I had been totally busy for the last weeks as
well. Also I can't decide to merge it - I'm not the official fuse
maintainer...
From my point of view, patch 1 is just missing to set the actual limit
and then would be clear and easy back-portable bug fix.
Not promised, I will try it out if I find a bit time tomorrow.


Bernd

2024-04-23 13:25:55

by Hou Tao

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio



On 4/23/2024 4:06 AM, Michael S. Tsirkin wrote:
> On Tue, Apr 09, 2024 at 09:48:08AM +0800, Hou Tao wrote:
>> Hi,
>>
>> On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote:
>>> On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote:
>>>> From: Hou Tao <[email protected]>
>>>>
>>>> Hi,
>>>>
>>>> The patch set aims to fix the warning related to an abnormal size
>>>> parameter of kmalloc() in virtiofs. The warning occurred when attempting
>>>> to insert a 10MB sized kernel module kept in a virtiofs with cache
>>>> disabled. As analyzed in patch #1, the root cause is that the length of
>>>> the read buffer is no limited, and the read buffer is passed directly to
>>>> virtiofs through out_args[0].value. Therefore patch #1 limits the
>>>> length of the read buffer passed to virtiofs by using max_pages. However
>>>> it is not enough, because now the maximal value of max_pages is 256.
>>>> Consequently, when reading a 10MB-sized kernel module, the length of the
>>>> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will
>>>> try to allocate 2MB from memory subsystem. The request for 2MB of
>>>> physically contiguous memory significantly stress the memory subsystem
>>>> and may fail indefinitely on hosts with fragmented memory. To address
>>>> this, patch #2~#5 use scattered pages in a bio_vec to replace the
>>>> kmalloc-allocated bounce buffer when the length of the bounce buffer for
>>>> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the
>>>> allocation of the bounce buffer and sg array in virtiofs is that
>>>> GFP_ATOMIC is used even when the allocation occurs in a kworker context.
>>>> Therefore the last patch uses GFP_NOFS for the allocation of both sg
>>>> array and bounce buffer when initiated by the kworker. For more details,
>>>> please check the individual patches.
>>>>
>>>> As usual, comments are always welcome.
>>>>
>>>> Change Log:
>>> Bernd should I just merge the patchset as is?
>>> It seems to fix a real problem and no one has the
>>> time to work on a better fix .... WDYT?
>> Sorry for the long delay. I am just start to prepare for v3. In v3, I
>> plan to avoid the unnecessary memory copy between fuse args and bio_vec.
>> Will post it before next week.
> Didn't happen before this week apparently.

Sorry for failing to make it this week. Being busy these two weeks. Hope
to send v3 out before the end of April.
>
>>>
>>>> v2:
>>>> * limit the length of ITER_KVEC dio by max_pages instead of the
>>>> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC
>>>> dio being consistent with other rw operations.
>>>> * replace kmalloc-allocated bounce buffer by using a bounce buffer
>>>> backed by scattered pages when the length of the bounce buffer for
>>>> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with
>>>> fragmented memory, the KVEC_ITER dio can be handled normally by
>>>> virtiofs. (Bernd Schubert)
>>>> * merge the GFP_NOFS patch [1] into this patch-set and use
>>>> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS
>>>> (Benjamin Coddington)
>>>>
>>>> v1: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>>>
>>>> [1]: https://lore.kernel.org/linux-fsdevel/[email protected]/
>>>>
>>>> Hou Tao (6):
>>>> fuse: limit the length of ITER_KVEC dio by max_pages
>>>> virtiofs: move alloc/free of argbuf into separated helpers
>>>> virtiofs: factor out more common methods for argbuf
>>>> virtiofs: support bounce buffer backed by scattered pages
>>>> virtiofs: use scattered bounce buffer for ITER_KVEC dio
>>>> virtiofs: use GFP_NOFS when enqueuing request through kworker
>>>>
>>>> fs/fuse/file.c | 12 +-
>>>> fs/fuse/virtio_fs.c | 336 +++++++++++++++++++++++++++++++++++++-------
>>>> 2 files changed, 296 insertions(+), 52 deletions(-)
>>>>
>>>> --
>>>> 2.29.2