2021-03-18 13:54:41

by Connor Kuehl

Subject: [PATCH 0/3] virtiofs: split requests that exceed virtqueue size

I also have a patch in this series that fixes a typo I saw while I was
in there.

For testing, I have been using a toy program that performs a readv or a
writev with a number of iovecs that exceeds the number of descriptors
available to the virtqueue, and observing that the number of
scattergather lists generated doesn't exceed the size of the virtqueue.
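
The toy program itself is not included in this series. A minimal sketch of
what such a test might look like follows; the helper name
write_many_iovecs() and its parameters are hypothetical, and the caller is
assumed to keep `count` within the system's iovec limit (IOV_MAX, commonly
1024 on Linux).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the program used for this series: issue a
 * single writev() built from `count` iovecs, each covering its own
 * `seg_len`-byte slot of one buffer.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t write_many_iovecs(int fd, int count, size_t seg_len)
{
	struct iovec *iov;
	char *buf;
	ssize_t ret;
	int i;

	assert(count > 0 && seg_len > 0);

	iov = calloc((size_t)count, sizeof(*iov));
	buf = malloc((size_t)count * seg_len);
	if (!iov || !buf) {
		free(iov);
		free(buf);
		return -1;
	}
	memset(buf, 'x', (size_t)count * seg_len);

	/* One iovec per slot, so the request has `count` segments. */
	for (i = 0; i < count; i++) {
		iov[i].iov_base = buf + (size_t)i * seg_len;
		iov[i].iov_len = seg_len;
	}

	ret = writev(fd, iov, count);
	free(iov);
	free(buf);
	return ret;
}
```

Pointing fd at a file on a virtiofs mount and choosing a count larger than
the request queue's size would be one way to exercise the path described
above; whether each iovec ends up as its own page in the FUSE request
depends on how the file is opened and on the segment size.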

Connor Kuehl (3):
virtio_ring: always warn when descriptor chain exceeds queue size
virtiofs: split requests that exceed virtqueue size
fuse: fix typo for fuse_conn.max_pages comment

drivers/virtio/virtio_ring.c | 7 ++++---
fs/fuse/fuse_i.h | 7 ++++++-
fs/fuse/inode.c | 7 +++++++
fs/fuse/virtio_fs.c | 14 ++++++++++++++
4 files changed, 31 insertions(+), 4 deletions(-)

--
2.30.2


2021-03-18 13:56:43

by Connor Kuehl

Subject: [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment

'Maxmum' -> 'Maximum'

Signed-off-by: Connor Kuehl <[email protected]>
---
fs/fuse/fuse_i.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f0e4ee906464..8bdee79ba593 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -552,7 +552,7 @@ struct fuse_conn {
/** Maximum write size */
unsigned max_write;

- /** Maxmum number of pages that can be used in a single request */
+ /** Maximum number of pages that can be used in a single request */
unsigned int max_pages;

#if IS_ENABLED(CONFIG_VIRTIO_FS)
--
2.30.2

2021-03-18 13:56:44

by Connor Kuehl

Subject: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

If an incoming FUSE request can't fit on the virtqueue, the request is
placed onto a workqueue so a worker can try to resubmit it later where
there will (hopefully) be space for it next time.

This is fine for requests that aren't larger than a virtqueue's maximum
capacity. However, if a request's size exceeds the maximum capacity of
the virtqueue (even if the virtqueue is empty), it will be doomed to a
life of being placed on the workqueue, removed, discovered it won't fit,
and placed on the workqueue yet again.

Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
Descriptors) of the virtio spec:

"A driver MUST NOT create a descriptor chain longer than the Queue
Size of the device."

To fix this, limit the number of pages FUSE will use for an overall
request. This way, each request can realistically fit on the virtqueue
when it is decomposed into a scattergather list and avoid violating
section 2.6.5.3.1 of the virtio spec.

Signed-off-by: Connor Kuehl <[email protected]>
---
fs/fuse/fuse_i.h | 5 +++++
fs/fuse/inode.c | 7 +++++++
fs/fuse/virtio_fs.c | 14 ++++++++++++++
3 files changed, 26 insertions(+)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 68cca8d4db6e..f0e4ee906464 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -555,6 +555,11 @@ struct fuse_conn {
/** Maxmum number of pages that can be used in a single request */
unsigned int max_pages;

+#if IS_ENABLED(CONFIG_VIRTIO_FS)
+ /** Constrain ->max_pages to this value during feature negotiation */
+ unsigned int transport_capacity;
+#endif
+
/** Input queue */
struct fuse_iqueue iq;

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index b0e18b470e91..42cc72ba13d9 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1058,6 +1058,13 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args,
fc->no_flock = 1;
}

+#if IS_ENABLED(CONFIG_VIRTIO_FS)
+ /* fuse_conn_init() sets this to zero for all others, this is
+ * explicitly set by virtio_fs.
+ */
+ if (fc->transport_capacity)
+ fc->max_pages = min_t(unsigned int, fc->max_pages, fc->transport_capacity);
+#endif
fm->sb->s_bdi->ra_pages =
min(fm->sb->s_bdi->ra_pages, ra_pages);
fc->minor = arg->minor;
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 8868ac31a3c0..a6ffba85d59a 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -18,6 +18,12 @@
#include <linux/uio.h>
#include "fuse_i.h"

+/* Used to help calculate the FUSE connection's max_pages limit for a request's
+ * size. Parts of the struct fuse_req are sliced into scattergather lists in
+ * addition to the pages used, so this can help account for that overhead.
+ */
+#define FUSE_HEADER_OVERHEAD 4
+
/* List of virtio-fs device instances and a lock for the list. Also provides
* mutual exclusion in device removal and mounting path
*/
@@ -1408,6 +1414,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
struct super_block *sb;
struct fuse_conn *fc;
struct fuse_mount *fm;
+ unsigned int virtqueue_size;
int err;

/* This gets a reference on virtio_fs object. This ptr gets installed
@@ -1435,6 +1442,13 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
fc->delete_stale = true;
fc->auto_submounts = true;

+ /* Tell FUSE to split requests that exceed the virtqueue's size */
+ virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
+ WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD);
+ fc->transport_capacity = min_t(unsigned int,
+ virtqueue_size - FUSE_HEADER_OVERHEAD,
+ FUSE_MAX_MAX_PAGES);
+
fsc->s_fs_info = fm;
sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
if (fsc->s_fs_info) {
--
2.30.2

2021-03-18 15:21:32

by Miklos Szeredi

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of
> the virtqueue (even if the virtqueue is empty), it will be doomed to a
> life of being placed on the workqueue, removed, discovered it won't fit,
> and placed on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
> "A driver MUST NOT create a descriptor chain longer than the Queue
> Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating
> section 2.6.5.3.1 of the virtio spec.

I removed the conditional compilation and renamed the limit. Also made
virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.

The virtio_ring patch in this series should probably go through the respective
subsystem tree.


Thanks,
Miklos

---
From: Connor Kuehl <[email protected]>
Subject: virtiofs: split requests that exceed virtqueue size
Date: Thu, 18 Mar 2021 08:52:22 -0500

If an incoming FUSE request can't fit on the virtqueue, the request is
placed onto a workqueue so a worker can try to resubmit it later where
there will (hopefully) be space for it next time.

This is fine for requests that aren't larger than a virtqueue's maximum
capacity. However, if a request's size exceeds the maximum capacity of the
virtqueue (even if the virtqueue is empty), it will be doomed to a life of
being placed on the workqueue, removed, discovered it won't fit, and placed
on the workqueue yet again.

Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
Descriptors) of the virtio spec:

"A driver MUST NOT create a descriptor chain longer than the Queue
Size of the device."

To fix this, limit the number of pages FUSE will use for an overall
request. This way, each request can realistically fit on the virtqueue
when it is decomposed into a scattergather list and avoid violating section
2.6.5.3.1 of the virtio spec.

Signed-off-by: Connor Kuehl <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
---
fs/fuse/fuse_i.h | 3 +++
fs/fuse/inode.c | 3 ++-
fs/fuse/virtio_fs.c | 19 +++++++++++++++++--
3 files changed, 22 insertions(+), 3 deletions(-)

--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -555,6 +555,9 @@ struct fuse_conn {
/** Maxmum number of pages that can be used in a single request */
unsigned int max_pages;

+ /** Constrain ->max_pages to this value during feature negotiation */
+ unsigned int max_pages_limit;
+
/** Input queue */
struct fuse_iqueue iq;

--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
fc->user_ns = get_user_ns(user_ns);
fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
+ fc->max_pages_limit = FUSE_MAX_MAX_PAGES;

INIT_LIST_HEAD(&fc->mounts);
list_add(&fm->fc_entry, &fc->mounts);
@@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
fc->abort_err = 1;
if (arg->flags & FUSE_MAX_PAGES) {
fc->max_pages =
- min_t(unsigned int, FUSE_MAX_MAX_PAGES,
+ min_t(unsigned int, fc->max_pages_limit,
max_t(unsigned int, arg->max_pages, 1));
}
if (IS_ENABLED(CONFIG_FUSE_DAX) &&
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -18,6 +18,12 @@
#include <linux/uio.h>
#include "fuse_i.h"

+/* Used to help calculate the FUSE connection's max_pages limit for a request's
+ * size. Parts of the struct fuse_req are sliced into scattergather lists in
+ * addition to the pages used, so this can help account for that overhead.
+ */
+#define FUSE_HEADER_OVERHEAD 4
+
/* List of virtio-fs device instances and a lock for the list. Also provides
* mutual exclusion in device removal and mounting path
*/
@@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
{
struct virtio_fs *fs;
struct super_block *sb;
- struct fuse_conn *fc;
+ struct fuse_conn *fc = NULL;
struct fuse_mount *fm;
- int err;
+ unsigned int virtqueue_size;
+ int err = -EIO;

/* This gets a reference on virtio_fs object. This ptr gets installed
* in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
@@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
return -EINVAL;
}

+ virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
+ if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
+ goto out_err;
+
err = -ENOMEM;
fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
if (!fc)
@@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
fc->delete_stale = true;
fc->auto_submounts = true;

+ /* Tell FUSE to split requests that exceed the virtqueue's size */
+ fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
+ virtqueue_size - FUSE_HEADER_OVERHEAD);
+
fsc->s_fs_info = fm;
sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
if (fsc->s_fs_info) {

2021-03-18 15:54:47

by Connor Kuehl

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On 3/18/21 10:17 AM, Miklos Szeredi wrote:
> I removed the conditional compilation and renamed the limit. Also made
> virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.

Thanks, Miklos. I think it looks better with those changes.

> The virtio_ring patch in this series should probably go through the respective
> subsystem tree.

Makes sense. I've CC'd everyone that ./scripts/get_maintainer.pl
suggested for that patch on this entire series as well. Should I resend
patch #1 through just that subsystem to avoid confusion, or wait to see
if it gets picked out of this series?

Thanks again,

Connor

2021-03-19 13:53:59

by Vivek Goyal

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of
> the virtqueue (even if the virtqueue is empty), it will be doomed to a
> life of being placed on the workqueue, removed, discovered it won't fit,
> and placed on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
> "A driver MUST NOT create a descriptor chain longer than the Queue
> Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating
> section 2.6.5.3.1 of the virtio spec.

Hi Connor,

So as of now if a request is bigger than what virtqueue can support,
it never gets dispatched and caller waits infinitely? So this patch
will fix it by forcing fuse to split the request. That sounds good.


[..]
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 8868ac31a3c0..a6ffba85d59a 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -18,6 +18,12 @@
> #include <linux/uio.h>
> #include "fuse_i.h"
>
> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
> + * addition to the pages used, so this can help account for that overhead.
> + */
> +#define FUSE_HEADER_OVERHEAD 4

How did you arrive at this overhead? Is it the following?

- One sg element for fuse_in_header.
- One sg element for input arguments.
- One sg element for fuse_out_header.
- One sg element for output args.

Thanks
Vivek

2021-03-19 14:18:58

by Connor Kuehl

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On 3/19/21 8:49 AM, Vivek Goyal wrote:
> On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
>> If an incoming FUSE request can't fit on the virtqueue, the request is
>> placed onto a workqueue so a worker can try to resubmit it later where
>> there will (hopefully) be space for it next time.
>>
>> This is fine for requests that aren't larger than a virtqueue's maximum
>> capacity. However, if a request's size exceeds the maximum capacity of
>> the virtqueue (even if the virtqueue is empty), it will be doomed to a
>> life of being placed on the workqueue, removed, discovered it won't fit,
>> and placed on the workqueue yet again.
>>
>> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
>> Descriptors) of the virtio spec:
>>
>> "A driver MUST NOT create a descriptor chain longer than the Queue
>> Size of the device."
>>
>> To fix this, limit the number of pages FUSE will use for an overall
>> request. This way, each request can realistically fit on the virtqueue
>> when it is decomposed into a scattergather list and avoid violating
>> section 2.6.5.3.1 of the virtio spec.
>
> Hi Connor,
>
> So as of now if a request is bigger than what virtqueue can support,
> it never gets dispatched and caller waits infinitely? So this patch
> will fix it by forcing fuse to split the request. That sounds good.

Right, in theory. Certain configurations make this easier to avoid, such
as using indirect descriptors; however, the virtio spec says that even
when indirect descriptors are used, the descriptor chain length
shouldn't exceed the queue's size anyway. So having FUSE split the
request also helps to uphold that property.

This is my reading of the potential looping problem:

virtio_fs_wake_pending_and_unlock
calls
virtio_fs_enqueue_req
calls
virtqueue_add_sgs

virtqueue_add_sgs can return -ENOSPC if there aren't enough descriptors
available.

This error gets propagated back up to
virtio_fs_wake_pending_and_unlock, which checks for this exact issue and
places the request on a workqueue to retry submission later.

Resubmission occurs in virtio_fs_request_dispatch_work, which does a
similar dance, where if the request fails with -ENOSPC it just puts it
back in the queue. However, for a sufficiently large request that would
exceed the capacity of the virtqueue (even when empty), no amount of
retrying will ever make it fit.
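
The retry behaviour described above can be modelled in user space. This is
only a sketch: struct toy_vq, toy_add_sgs() and toy_retry() are
hypothetical stand-ins that track descriptor counts, not the kernel's
virtqueue code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a virtqueue: only tracks descriptor counts. */
struct toy_vq {
	unsigned int size;	/* total descriptors in the ring */
	unsigned int in_use;	/* descriptors held by in-flight requests */
};

/* Mimics the -ENOSPC result described above for virtqueue_add_sgs(). */
static int toy_add_sgs(struct toy_vq *vq, unsigned int needed)
{
	assert(vq->in_use <= vq->size);

	if (needed > vq->size - vq->in_use)
		return -ENOSPC;
	vq->in_use += needed;
	return 0;
}

/*
 * Retry the way the dispatch worker would: after each failed attempt,
 * completions free up to `freed_per_round` descriptors.
 * Returns true if the request was queued within `max_rounds` attempts.
 */
static bool toy_retry(struct toy_vq *vq, unsigned int needed,
		      unsigned int freed_per_round, unsigned int max_rounds)
{
	unsigned int round;

	for (round = 0; round < max_rounds; round++) {
		if (toy_add_sgs(vq, needed) == 0)
			return true;
		/* The queue drains, but at best it becomes empty. */
		if (vq->in_use > freed_per_round)
			vq->in_use -= freed_per_round;
		else
			vq->in_use = 0;
	}
	return false;
}
```

With a 128-entry queue, a request needing 100 descriptors is queued once
enough in-flight requests complete, while one needing 260 still fails
after the queue has fully drained, no matter how many rounds are allowed.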

>
>
> [..]
>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
>> index 8868ac31a3c0..a6ffba85d59a 100644
>> --- a/fs/fuse/virtio_fs.c
>> +++ b/fs/fuse/virtio_fs.c
>> @@ -18,6 +18,12 @@
>> #include <linux/uio.h>
>> #include "fuse_i.h"
>>
>> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
>> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
>> + * addition to the pages used, so this can help account for that overhead.
>> + */
>> +#define FUSE_HEADER_OVERHEAD 4
>
> How did you arrive at this overhead? Is it the following?
>
> - One sg element for fuse_in_header.
> - One sg element for input arguments.
> - One sg element for fuse_out_header.
> - One sg element for output args.

Yes, that's exactly how I got to that number.
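
As a rough user-space illustration of how those four elements feed into
the limit, and how the limit then caps the value negotiated with the
server: the toy_* helpers and the MIN/MAX macros below are hypothetical,
and the value 256 for FUSE_MAX_MAX_PAGES is an assumption about
fs/fuse/fuse_i.h at the time of this series.

```c
#include <assert.h>

/* One sg element each: in header, in args, out header, out args. */
#define FUSE_HEADER_OVERHEAD	4u
/* Assumed value; check fs/fuse/fuse_i.h in the tree being built. */
#define FUSE_MAX_MAX_PAGES	256u

#define MIN(a, b)	((a) < (b) ? (a) : (b))
#define MAX(a, b)	((a) > (b) ? (a) : (b))

/*
 * Largest number of pages per request that still leaves room for the
 * four header/argument elements. Returns 0 if the queue is too small
 * to carry any page at all.
 */
static unsigned int toy_max_pages_limit(unsigned int virtqueue_size)
{
	if (virtqueue_size <= FUSE_HEADER_OVERHEAD)
		return 0;
	return MIN(FUSE_MAX_MAX_PAGES, virtqueue_size - FUSE_HEADER_OVERHEAD);
}

/* Clamp the server's requested max_pages to the range [1, limit]. */
static unsigned int toy_negotiate_max_pages(unsigned int limit,
					    unsigned int requested)
{
	assert(limit >= 1);
	return MIN(limit, MAX(requested, 1u));
}
```

For a 128-entry request queue this gives a limit of 124 pages, so a
maximal request uses 124 + 4 = 128 descriptors and exactly fits an empty
queue.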

Connor


2021-03-20 20:06:24

by Michael S. Tsirkin

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Thu, Mar 18, 2021 at 10:52:14AM -0500, Connor Kuehl wrote:
> On 3/18/21 10:17 AM, Miklos Szeredi wrote:
> > I removed the conditional compilation and renamed the limit. Also made
> > virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.
>
> Thanks, Miklos. I think it looks better with those changes.
>
> > The virtio_ring patch in this series should probably go through the respective
> > subsystem tree.
>
> Makes sense. I've CC'd everyone that ./scripts/get_maintainer.pl suggested
> for that patch on this entire series as well. Should I resend patch #1
> through just that subsystem to avoid confusion, or wait to see if it gets
> picked out of this series?

Yes pls post separately. Thanks!

> Thanks again,
>
> Connor

2021-03-22 03:44:59

by Jason Wang

Subject: Re: [PATCH 3/3] fuse: fix typo for fuse_conn.max_pages comment


On 2021/3/18 9:52 PM, Connor Kuehl wrote:
> 'Maxmum' -> 'Maximum'


Need a better log here.

With the commit log fixed.

Acked-by: Jason Wang <[email protected]>


>
> Signed-off-by: Connor Kuehl <[email protected]>
> ---
> fs/fuse/fuse_i.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index f0e4ee906464..8bdee79ba593 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -552,7 +552,7 @@ struct fuse_conn {
> /** Maximum write size */
> unsigned max_write;
>
> - /** Maxmum number of pages that can be used in a single request */
> + /** Maximum number of pages that can be used in a single request */
> unsigned int max_pages;
>
> #if IS_ENABLED(CONFIG_VIRTIO_FS)

2021-03-22 15:52:21

by Stefan Hajnoczi

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of
> the virtqueue (even if the virtqueue is empty), it will be doomed to a
> life of being placed on the workqueue, removed, discovered it won't fit,
> and placed on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
> "A driver MUST NOT create a descriptor chain longer than the Queue
> Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating
> section 2.6.5.3.1 of the virtio spec.
>
> Signed-off-by: Connor Kuehl <[email protected]>
> ---
> fs/fuse/fuse_i.h | 5 +++++
> fs/fuse/inode.c | 7 +++++++
> fs/fuse/virtio_fs.c | 14 ++++++++++++++
> 3 files changed, 26 insertions(+)

Nice that FUSE already has max_pages :-).

Reviewed-by: Stefan Hajnoczi <[email protected]>



2021-03-22 19:03:46

by Vivek Goyal

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Thu, Mar 18, 2021 at 04:17:51PM +0100, Miklos Szeredi wrote:
> On Thu, Mar 18, 2021 at 08:52:22AM -0500, Connor Kuehl wrote:
> > If an incoming FUSE request can't fit on the virtqueue, the request is
> > placed onto a workqueue so a worker can try to resubmit it later where
> > there will (hopefully) be space for it next time.
> >
> > This is fine for requests that aren't larger than a virtqueue's maximum
> > capacity. However, if a request's size exceeds the maximum capacity of
> > the virtqueue (even if the virtqueue is empty), it will be doomed to a
> > life of being placed on the workqueue, removed, discovered it won't fit,
> > and placed on the workqueue yet again.
> >
> > Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> > Descriptors) of the virtio spec:
> >
> > "A driver MUST NOT create a descriptor chain longer than the Queue
> > Size of the device."
> >
> > To fix this, limit the number of pages FUSE will use for an overall
> > request. This way, each request can realistically fit on the virtqueue
> > when it is decomposed into a scattergather list and avoid violating
> > section 2.6.5.3.1 of the virtio spec.
>
> I removed the conditional compilation and renamed the limit. Also made
> virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.
>
> The virtio_ring patch in this series should probably go through the respective
> subsystem tree.
>
>
> Thanks,
> Miklos
>
> ---
> From: Connor Kuehl <[email protected]>
> Subject: virtiofs: split requests that exceed virtqueue size
> Date: Thu, 18 Mar 2021 08:52:22 -0500
>
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of the
> virtqueue (even if the virtqueue is empty), it will be doomed to a life of
> being placed on the workqueue, removed, discovered it won't fit, and placed
> on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
> "A driver MUST NOT create a descriptor chain longer than the Queue
> Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating section
> 2.6.5.3.1 of the virtio spec.
>
> Signed-off-by: Connor Kuehl <[email protected]>
> Signed-off-by: Miklos Szeredi <[email protected]>
> ---

Looks good to me.

Reviewed-by: Vivek Goyal <[email protected]>

Vivek

> fs/fuse/fuse_i.h | 3 +++
> fs/fuse/inode.c | 3 ++-
> fs/fuse/virtio_fs.c | 19 +++++++++++++++++--
> 3 files changed, 22 insertions(+), 3 deletions(-)
>
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -555,6 +555,9 @@ struct fuse_conn {
> /** Maxmum number of pages that can be used in a single request */
> unsigned int max_pages;
>
> + /** Constrain ->max_pages to this value during feature negotiation */
> + unsigned int max_pages_limit;
> +
> /** Input queue */
> struct fuse_iqueue iq;
>
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
> fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
> fc->user_ns = get_user_ns(user_ns);
> fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
> + fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
>
> INIT_LIST_HEAD(&fc->mounts);
> list_add(&fm->fc_entry, &fc->mounts);
> @@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
> fc->abort_err = 1;
> if (arg->flags & FUSE_MAX_PAGES) {
> fc->max_pages =
> - min_t(unsigned int, FUSE_MAX_MAX_PAGES,
> + min_t(unsigned int, fc->max_pages_limit,
> max_t(unsigned int, arg->max_pages, 1));
> }
> if (IS_ENABLED(CONFIG_FUSE_DAX) &&
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -18,6 +18,12 @@
> #include <linux/uio.h>
> #include "fuse_i.h"
>
> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
> + * addition to the pages used, so this can help account for that overhead.
> + */
> +#define FUSE_HEADER_OVERHEAD 4
> +
> /* List of virtio-fs device instances and a lock for the list. Also provides
> * mutual exclusion in device removal and mounting path
> */
> @@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
> {
> struct virtio_fs *fs;
> struct super_block *sb;
> - struct fuse_conn *fc;
> + struct fuse_conn *fc = NULL;
> struct fuse_mount *fm;
> - int err;
> + unsigned int virtqueue_size;
> + int err = -EIO;
>
> /* This gets a reference on virtio_fs object. This ptr gets installed
> * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
> @@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
> return -EINVAL;
> }
>
> + virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
> + if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
> + goto out_err;
> +
> err = -ENOMEM;
> fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
> if (!fc)
> @@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
> fc->delete_stale = true;
> fc->auto_submounts = true;
>
> + /* Tell FUSE to split requests that exceed the virtqueue's size */
> + fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
> + virtqueue_size - FUSE_HEADER_OVERHEAD);
> +
> fsc->s_fs_info = fm;
> sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
> if (fsc->s_fs_info) {
>

2021-03-24 15:11:41

by Connor Kuehl

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On 3/18/21 10:17 AM, Miklos Szeredi wrote:
> I removed the conditional compilation and renamed the limit. Also made
> virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.

Hi Miklos,

Has this patch been queued?

Connor

> ---
> From: Connor Kuehl <[email protected]>
> Subject: virtiofs: split requests that exceed virtqueue size
> Date: Thu, 18 Mar 2021 08:52:22 -0500
>
> If an incoming FUSE request can't fit on the virtqueue, the request is
> placed onto a workqueue so a worker can try to resubmit it later where
> there will (hopefully) be space for it next time.
>
> This is fine for requests that aren't larger than a virtqueue's maximum
> capacity. However, if a request's size exceeds the maximum capacity of the
> virtqueue (even if the virtqueue is empty), it will be doomed to a life of
> being placed on the workqueue, removed, discovered it won't fit, and placed
> on the workqueue yet again.
>
> Furthermore, from section 2.6.5.3.1 (Driver Requirements: Indirect
> Descriptors) of the virtio spec:
>
> "A driver MUST NOT create a descriptor chain longer than the Queue
> Size of the device."
>
> To fix this, limit the number of pages FUSE will use for an overall
> request. This way, each request can realistically fit on the virtqueue
> when it is decomposed into a scattergather list and avoid violating section
> 2.6.5.3.1 of the virtio spec.
>
> Signed-off-by: Connor Kuehl <[email protected]>
> Signed-off-by: Miklos Szeredi <[email protected]>
> ---
> fs/fuse/fuse_i.h | 3 +++
> fs/fuse/inode.c | 3 ++-
> fs/fuse/virtio_fs.c | 19 +++++++++++++++++--
> 3 files changed, 22 insertions(+), 3 deletions(-)
>
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -555,6 +555,9 @@ struct fuse_conn {
> /** Maxmum number of pages that can be used in a single request */
> unsigned int max_pages;
>
> + /** Constrain ->max_pages to this value during feature negotiation */
> + unsigned int max_pages_limit;
> +
> /** Input queue */
> struct fuse_iqueue iq;
>
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -712,6 +712,7 @@ void fuse_conn_init(struct fuse_conn *fc
> fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
> fc->user_ns = get_user_ns(user_ns);
> fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
> + fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
>
> INIT_LIST_HEAD(&fc->mounts);
> list_add(&fm->fc_entry, &fc->mounts);
> @@ -1040,7 +1041,7 @@ static void process_init_reply(struct fu
> fc->abort_err = 1;
> if (arg->flags & FUSE_MAX_PAGES) {
> fc->max_pages =
> - min_t(unsigned int, FUSE_MAX_MAX_PAGES,
> + min_t(unsigned int, fc->max_pages_limit,
> max_t(unsigned int, arg->max_pages, 1));
> }
> if (IS_ENABLED(CONFIG_FUSE_DAX) &&
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -18,6 +18,12 @@
> #include <linux/uio.h>
> #include "fuse_i.h"
>
> +/* Used to help calculate the FUSE connection's max_pages limit for a request's
> + * size. Parts of the struct fuse_req are sliced into scattergather lists in
> + * addition to the pages used, so this can help account for that overhead.
> + */
> +#define FUSE_HEADER_OVERHEAD 4
> +
> /* List of virtio-fs device instances and a lock for the list. Also provides
> * mutual exclusion in device removal and mounting path
> */
> @@ -1413,9 +1419,10 @@ static int virtio_fs_get_tree(struct fs_
> {
> struct virtio_fs *fs;
> struct super_block *sb;
> - struct fuse_conn *fc;
> + struct fuse_conn *fc = NULL;
> struct fuse_mount *fm;
> - int err;
> + unsigned int virtqueue_size;
> + int err = -EIO;
>
> /* This gets a reference on virtio_fs object. This ptr gets installed
> * in fc->iq->priv. Once fuse_conn is going away, it calls ->put()
> @@ -1427,6 +1434,10 @@ static int virtio_fs_get_tree(struct fs_
> return -EINVAL;
> }
>
> + virtqueue_size = virtqueue_get_vring_size(fs->vqs[VQ_REQUEST].vq);
> + if (WARN_ON(virtqueue_size <= FUSE_HEADER_OVERHEAD))
> + goto out_err;
> +
> err = -ENOMEM;
> fc = kzalloc(sizeof(struct fuse_conn), GFP_KERNEL);
> if (!fc)
> @@ -1442,6 +1453,10 @@ static int virtio_fs_get_tree(struct fs_
> fc->delete_stale = true;
> fc->auto_submounts = true;
>
> + /* Tell FUSE to split requests that exceed the virtqueue's size */
> + fc->max_pages_limit = min_t(unsigned int, fc->max_pages_limit,
> + virtqueue_size - FUSE_HEADER_OVERHEAD);
> +
> fsc->s_fs_info = fm;
> sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
> if (fsc->s_fs_info) {
>

2021-03-24 15:33:43

by Connor Kuehl

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On 3/24/21 10:30 AM, Miklos Szeredi wrote:
> On Wed, Mar 24, 2021 at 4:09 PM Connor Kuehl <[email protected]> wrote:
>>
>> On 3/18/21 10:17 AM, Miklos Szeredi wrote:
>>> I removed the conditional compilation and renamed the limit. Also made
>>> virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.
>>
>> Hi Miklos,
>>
>> Has this patch been queued?
>
> It's in my internal patch queue at the moment. Will push to
> fuse.git#for-next in a couple of days.

Cool! Thank you :-)

Connor

2021-03-25 03:14:04

by Miklos Szeredi

Subject: Re: [PATCH 2/3] virtiofs: split requests that exceed virtqueue size

On Wed, Mar 24, 2021 at 4:09 PM Connor Kuehl <[email protected]> wrote:
>
> On 3/18/21 10:17 AM, Miklos Szeredi wrote:
> > I removed the conditional compilation and renamed the limit. Also made
> > virtio_fs_get_tree() bail out if it hit the WARN_ON(). Updated patch below.
>
> Hi Miklos,
>
> Has this patch been queued?

It's in my internal patch queue at the moment. Will push to
fuse.git#for-next in a couple of days.

Thanks,
Miklos