2024-02-12 17:20:00

by Steven Sistare

Subject: [PATCH V2 0/3] flush workers on suspend

Flush to guarantee no workers are running when suspend returns,
for vdpa, vdpa_sim, and vduse. (mlx5 already does so, via the path
mlx5_vdpa_suspend -> unregister_link_notifier -> flush_workqueue.)

Steve Sistare (3):
vhost-vdpa: flush workers on suspend
vduse: suspend
vdpa_sim: flush workers on suspend

drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
drivers/vdpa/vdpa_user/vduse_dev.c | 24 ++++++++++++++++++++++++
drivers/vhost/vdpa.c | 3 +++
3 files changed, 40 insertions(+)

--
2.39.3



2024-02-12 17:20:11

by Steven Sistare

Subject: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

Flush to guarantee no workers are running when suspend returns.

Signed-off-by: Steve Sistare <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index be2925d0d283..a662b90357c3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
kthread_flush_work(work);
}

+static void flush_work_fn(struct kthread_work *work) {}
+
+static void vdpasim_flush_work(struct vdpasim *vdpasim)
+{
+ struct kthread_work work;
+
+ kthread_init_work(&work, flush_work_fn);
+ kthread_queue_work(vdpasim->worker, &work);
+ kthread_flush_work(&work);
+}
+
static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
{
return container_of(vdpa, struct vdpasim, vdpa);
@@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
vdpasim->running = false;
mutex_unlock(&vdpasim->mutex);

+ vdpasim_flush_work(vdpasim);
+
return 0;
}

--
2.39.3


2024-02-12 17:28:22

by Steven Sistare

Subject: [PATCH V2 1/3] vhost-vdpa: flush workers on suspend

Flush to guarantee no workers are running when suspend returns.

Signed-off-by: Steve Sistare <[email protected]>
---
drivers/vhost/vdpa.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index bc4a51e4638b..a3b986c24805 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -594,10 +594,13 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
struct vdpa_device *vdpa = v->vdpa;
const struct vdpa_config_ops *ops = vdpa->config;
int ret;
+ struct vhost_dev *vdev = &v->vdev;

if (!ops->suspend)
return -EOPNOTSUPP;

+ vhost_dev_flush(vdev);
+
ret = ops->suspend(vdpa);
if (!ret)
v->suspended = true;
--
2.39.3


2024-02-12 17:40:13

by Steven Sistare

Subject: [PATCH V2 2/3] vduse: suspend

Support the suspend operation. There is little to do, except flush to
guarantee no workers are running when suspend returns.

Signed-off-by: Steve Sistare <[email protected]>
---
drivers/vdpa/vdpa_user/vduse_dev.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
index 1d24da79c399..503030f19e52 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -472,6 +472,18 @@ static void vduse_dev_reset(struct vduse_dev *dev)
up_write(&dev->rwsem);
}

+static void vduse_flush_work(struct vduse_dev *dev)
+{
+ flush_work(&dev->inject);
+
+ for (int i = 0; i < dev->vq_num; i++) {
+ struct vduse_virtqueue *vq = dev->vqs[i];
+
+ flush_work(&vq->inject);
+ flush_work(&vq->kick);
+ }
+}
+
static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx,
u64 desc_area, u64 driver_area,
u64 device_area)
@@ -713,6 +725,17 @@ static int vduse_vdpa_reset(struct vdpa_device *vdpa)
return ret;
}

+static int vduse_vdpa_suspend(struct vdpa_device *vdpa)
+{
+ struct vduse_dev *dev = vdpa_to_vduse(vdpa);
+
+ down_write(&dev->rwsem);
+ vduse_flush_work(dev);
+ up_write(&dev->rwsem);
+
+ return 0;
+}
+
static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
{
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
@@ -794,6 +817,7 @@ static const struct vdpa_config_ops vduse_vdpa_config_ops = {
.set_vq_affinity = vduse_vdpa_set_vq_affinity,
.get_vq_affinity = vduse_vdpa_get_vq_affinity,
.reset = vduse_vdpa_reset,
+ .suspend = vduse_vdpa_suspend,
.set_map = vduse_vdpa_set_map,
.free = vduse_vdpa_free,
};
--
2.39.3


2024-02-13 15:16:00

by Steven Sistare

Subject: Re: [PATCH V2 0/3] flush workers on suspend

On 2/12/2024 12:16 PM, Steve Sistare wrote:
> Flush to guarantee no workers are running when suspend returns,
> for vdpa, vdpa_sim, and vduse. (mlx5 already does so, via the path
> mlx5_vdpa_suspend -> unregister_link_notifier -> flush_workqueue.)

Changes in V2:
- renamed "vduse: suspend" (was vduse: flush workers on suspend)
- call vhost_dev_flush unconditionally in "vhost-vdpa: flush workers on suspend"

- Steve

> Steve Sistare (3):
> vhost-vdpa: flush workers on suspend
> vduse: suspend
> vdpa_sim: flush workers on suspend
>
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> drivers/vdpa/vdpa_user/vduse_dev.c | 24 ++++++++++++++++++++++++
> drivers/vhost/vdpa.c | 3 +++
> 3 files changed, 40 insertions(+)
>

2024-02-13 15:59:40

by Eugenio Perez Martin

Subject: Re: [PATCH V2 1/3] vhost-vdpa: flush workers on suspend

On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>
> Flush to guarantee no workers are running when suspend returns.
>
> Signed-off-by: Steve Sistare <[email protected]>

Acked-by: Eugenio Pérez <[email protected]>

Should this have a Fixes tag?

> ---
> drivers/vhost/vdpa.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index bc4a51e4638b..a3b986c24805 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -594,10 +594,13 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
> struct vdpa_device *vdpa = v->vdpa;
> const struct vdpa_config_ops *ops = vdpa->config;
> int ret;
> + struct vhost_dev *vdev = &v->vdev;
>
> if (!ops->suspend)
> return -EOPNOTSUPP;
>
> + vhost_dev_flush(vdev);
> +
> ret = ops->suspend(vdpa);
> if (!ret)
> v->suspended = true;
> --
> 2.39.3
>


2024-02-13 16:11:43

by Eugenio Perez Martin

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>
> Flush to guarantee no workers are running when suspend returns.
>
> Signed-off-by: Steve Sistare <[email protected]>
> ---
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index be2925d0d283..a662b90357c3 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> kthread_flush_work(work);
> }
>
> +static void flush_work_fn(struct kthread_work *work) {}
> +
> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> +{
> + struct kthread_work work;
> +
> + kthread_init_work(&work, flush_work_fn);

If the work is already queued, doesn't it break the linked list
because of the memset in kthread_init_work?

> + kthread_queue_work(vdpasim->worker, &work);
> + kthread_flush_work(&work);
> +}
> +
> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> {
> return container_of(vdpa, struct vdpasim, vdpa);
> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> vdpasim->running = false;
> mutex_unlock(&vdpasim->mutex);
>
> + vdpasim_flush_work(vdpasim);

Do we need to protect the case where vdpasim_kick_vq and
vdpasim_suspend are called "at the same time"? Correct userland should
not be doing it but buggy or malicious could be. Just calling
vdpasim_flush_work with the mutex acquired would solve the issue,
doesn't it?

Thanks!

> +
> return 0;
> }
>
> --
> 2.39.3
>


2024-02-14 17:51:04

by Steven Sistare

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>>
>> Flush to guarantee no workers are running when suspend returns.
>>
>> Signed-off-by: Steve Sistare <[email protected]>
>> ---
>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> index be2925d0d283..a662b90357c3 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>> kthread_flush_work(work);
>> }
>>
>> +static void flush_work_fn(struct kthread_work *work) {}
>> +
>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>> +{
>> + struct kthread_work work;
>> +
>> + kthread_init_work(&work, flush_work_fn);
>
> If the work is already queued, doesn't it break the linked list
> because of the memset in kthread_init_work?

work is a local variable. It completes before vdpasim_flush_work returns,
thus is never already queued on entry to vdpasim_flush_work.
Am I missing your point?

>> + kthread_queue_work(vdpasim->worker, &work);
>> + kthread_flush_work(&work);
>> +}
>> +
>> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>> {
>> return container_of(vdpa, struct vdpasim, vdpa);
>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>> vdpasim->running = false;
>> mutex_unlock(&vdpasim->mutex);
>>
>> + vdpasim_flush_work(vdpasim);
>
> Do we need to protect the case where vdpasim_kick_vq and
> vdpasim_suspend are called "at the same time"? Correct userland should
> not be doing it but buggy or malicious could be. Just calling
> vdpasim_flush_work with the mutex acquired would solve the issue,
> doesn't it?

Good catch. I need to serialize access to vdpasim->running plus the worker queue
in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
from non-task contexts, I should define a new spinlock to be acquired in both functions.

- Steve

2024-02-14 18:18:26

by Steven Sistare

Subject: Re: [PATCH V2 1/3] vhost-vdpa: flush workers on suspend

On 2/13/2024 10:58 AM, Eugenio Perez Martin wrote:
> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>>
>> Flush to guarantee no workers are running when suspend returns.
>>
>> Signed-off-by: Steve Sistare <[email protected]>
>
> Acked-by: Eugenio Pérez <[email protected]>
>
> Should this have a Fixes tag?

Sure, I will add:

Fixes: f345a0143b4d ("vhost-vdpa: uAPI to suspend the device")

- Steve

>> ---
>> drivers/vhost/vdpa.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index bc4a51e4638b..a3b986c24805 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -594,10 +594,13 @@ static long vhost_vdpa_suspend(struct vhost_vdpa *v)
>> struct vdpa_device *vdpa = v->vdpa;
>> const struct vdpa_config_ops *ops = vdpa->config;
>> int ret;
>> + struct vhost_dev *vdev = &v->vdev;
>>
>> if (!ops->suspend)
>> return -EOPNOTSUPP;
>>
>> + vhost_dev_flush(vdev);
>> +
>> ret = ops->suspend(vdpa);
>> if (!ret)
>> v->suspended = true;
>> --
>> 2.39.3
>>
>

2024-02-14 19:40:19

by Eugenio Perez Martin

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
<[email protected]> wrote:
>
> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> > On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
> >>
> >> Flush to guarantee no workers are running when suspend returns.
> >>
> >> Signed-off-by: Steve Sistare <[email protected]>
> >> ---
> >> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> >> 1 file changed, 13 insertions(+)
> >>
> >> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> index be2925d0d283..a662b90357c3 100644
> >> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> >> kthread_flush_work(work);
> >> }
> >>
> >> +static void flush_work_fn(struct kthread_work *work) {}
> >> +
> >> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> >> +{
> >> + struct kthread_work work;
> >> +
> >> + kthread_init_work(&work, flush_work_fn);
> >
> > If the work is already queued, doesn't it break the linked list
> > because of the memset in kthread_init_work?
>
> work is a local variable. It completes before vdpasim_flush_work returns,
> thus is never already queued on entry to vdpasim_flush_work.
> Am I missing your point?
>

No, sorry, I was the one missing that. Thanks for explaining it :)!

I'm not so used to the kthread queue, but why not call
kthread_flush_work on vdpasim->work directly?

> >> + kthread_queue_work(vdpasim->worker, &work);
> >> + kthread_flush_work(&work);
> >> +}
> >> +
> >> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> >> {
> >> return container_of(vdpa, struct vdpasim, vdpa);
> >> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >> vdpasim->running = false;
> >> mutex_unlock(&vdpasim->mutex);
> >>
> >> + vdpasim_flush_work(vdpasim);
> >
> > Do we need to protect the case where vdpasim_kick_vq and
> > vdpasim_suspend are called "at the same time"? Correct userland should
> > not be doing it but buggy or malicious could be. Just calling
> > vdpasim_flush_work with the mutex acquired would solve the issue,
> > doesn't it?
>
> Good catch. I need to serialize access to vdpasim->running plus the worker queue
> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>
> - Steve
>


2024-02-14 19:52:50

by Steven Sistare

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
> <[email protected]> wrote:
>>
>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>>>>
>>>> Flush to guarantee no workers are running when suspend returns.
>>>>
>>>> Signed-off-by: Steve Sistare <[email protected]>
>>>> ---
>>>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>> 1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> index be2925d0d283..a662b90357c3 100644
>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>> kthread_flush_work(work);
>>>> }
>>>>
>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>> +
>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>> +{
>>>> + struct kthread_work work;
>>>> +
>>>> + kthread_init_work(&work, flush_work_fn);
>>>
>>> If the work is already queued, doesn't it break the linked list
>>> because of the memset in kthread_init_work?
>>
>> work is a local variable. It completes before vdpasim_flush_work returns,
>> thus is never already queued on entry to vdpasim_flush_work.
>> Am I missing your point?
>
> No, sorry, I was the one missing that. Thanks for explaining it :)!
>
> I'm not so used to the kthread queue, but why not calling
> kthread_flush_work on vdpasim->work directly?

vdpasim->work is not the only work posted to vdpasim->worker; see
vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
they are all flushed.

- Steve
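The flush-via-no-op-work pattern Steve describes can be sketched in userspace with pthreads (illustrative names only; this is not the kernel kthread_worker API, just the same idea: the worker runs items in FIFO order, so waiting for a freshly queued sentinel guarantees everything queued earlier has finished):

```c
/* Userspace analogue of vdpasim_flush_work: queue a no-op sentinel
 * and wait for it; FIFO ordering means all earlier works are done. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct work {
    void (*fn)(struct work *);
    struct work *next;
    bool done;
};

struct worker {
    pthread_mutex_t lock;
    pthread_cond_t cv;          /* signaled on queue or completion changes */
    struct work *head, *tail;
    bool stop;
};

static void *worker_thread(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    while (!w->stop) {
        struct work *item = w->head;

        if (!item) {
            pthread_cond_wait(&w->cv, &w->lock);
            continue;
        }
        w->head = item->next;
        if (!w->head)
            w->tail = NULL;
        pthread_mutex_unlock(&w->lock);
        item->fn(item);         /* run the work outside the lock */
        pthread_mutex_lock(&w->lock);
        item->done = true;
        pthread_cond_broadcast(&w->cv);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static void queue_work(struct worker *w, struct work *item)
{
    item->next = NULL;
    item->done = false;
    pthread_mutex_lock(&w->lock);
    if (w->tail)
        w->tail->next = item;
    else
        w->head = item;
    w->tail = item;
    pthread_cond_broadcast(&w->cv);
    pthread_mutex_unlock(&w->lock);
}

static void noop_fn(struct work *work) { (void)work; }

/* A stack-allocated sentinel is safe, per the discussion above:
 * we wait for it to complete before returning. */
static void flush_worker(struct worker *w)
{
    struct work sentinel = { .fn = noop_fn };

    queue_work(w, &sentinel);
    pthread_mutex_lock(&w->lock);
    while (!sentinel.done)
        pthread_cond_wait(&w->cv, &w->lock);
    pthread_mutex_unlock(&w->lock);
}
```

When `flush_worker` returns, every work item queued before the sentinel has run to completion, which is exactly the guarantee the suspend paths in this series need.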

>>>> + kthread_queue_work(vdpasim->worker, &work);
>>>> + kthread_flush_work(&work);
>>>> +}
>>>> +
>>>> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>> {
>>>> return container_of(vdpa, struct vdpasim, vdpa);
>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>> vdpasim->running = false;
>>>> mutex_unlock(&vdpasim->mutex);
>>>>
>>>> + vdpasim_flush_work(vdpasim);
>>>
>>> Do we need to protect the case where vdpasim_kick_vq and
>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>> not be doing it but buggy or malicious could be. Just calling
>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>> doesn't it?
>>
>> Good catch. I need to serialize access to vdpasim->running plus the worker queue
>> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
>> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>>
>> - Steve
>>
>

2024-02-15 15:49:26

by Eugenio Perez Martin

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare
<[email protected]> wrote:
>
> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
> > On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
> > <[email protected]> wrote:
> >>
> >> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> >>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
> >>>>
> >>>> Flush to guarantee no workers are running when suspend returns.
> >>>>
> >>>> Signed-off-by: Steve Sistare <[email protected]>
> >>>> ---
> >>>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> >>>> 1 file changed, 13 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> index be2925d0d283..a662b90357c3 100644
> >>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> >>>> kthread_flush_work(work);
> >>>> }
> >>>>
> >>>> +static void flush_work_fn(struct kthread_work *work) {}
> >>>> +
> >>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> >>>> +{
> >>>> + struct kthread_work work;
> >>>> +
> >>>> + kthread_init_work(&work, flush_work_fn);
> >>>
> >>> If the work is already queued, doesn't it break the linked list
> >>> because of the memset in kthread_init_work?
> >>
> >> work is a local variable. It completes before vdpasim_flush_work returns,
> >> thus is never already queued on entry to vdpasim_flush_work.
> >> Am I missing your point?
> >
> > No, sorry, I was the one missing that. Thanks for explaining it :)!
> >
> > I'm not so used to the kthread queue, but why not calling
> > kthread_flush_work on vdpasim->work directly?
>
> vdpasim->work is not the only work posted to vdpasim->worker; see
> vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
> they are all flushed.
>

But it is ok to have concurrent mm updates, isn't it? Moreover, they
can be enqueued immediately after the kthread_flush_work already, as
there is no lock protecting it.

> - Steve
>
> >>>> + kthread_queue_work(vdpasim->worker, &work);
> >>>> + kthread_flush_work(&work);
> >>>> +}
> >>>> +
> >>>> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> >>>> {
> >>>> return container_of(vdpa, struct vdpasim, vdpa);
> >>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >>>> vdpasim->running = false;
> >>>> mutex_unlock(&vdpasim->mutex);
> >>>>
> >>>> + vdpasim_flush_work(vdpasim);
> >>>
> >>> Do we need to protect the case where vdpasim_kick_vq and
> >>> vdpasim_suspend are called "at the same time"? Correct userland should
> >>> not be doing it but buggy or malicious could be. Just calling
> >>> vdpasim_flush_work with the mutex acquired would solve the issue,
> >>> doesn't it?
> >>
> >> Good catch. I need to serialize access to vdpasim->running plus the worker queue
> >> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
> >> from non-task contexts, I should define a new spinlock to be acquired in both functions.
> >>
> >> - Steve
> >>
> >
>


2024-02-16 15:16:42

by Steven Sistare

Subject: Re: [PATCH V2 3/3] vdpa_sim: flush workers on suspend

On 2/15/2024 10:44 AM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare
> <[email protected]> wrote:
>>
>> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
>>> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
>>> <[email protected]> wrote:
>>>>
>>>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <[email protected]> wrote:
>>>>>>
>>>>>> Flush to guarantee no workers are running when suspend returns.
>>>>>>
>>>>>> Signed-off-by: Steve Sistare <[email protected]>
>>>>>> ---
>>>>>> drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>>>> 1 file changed, 13 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> index be2925d0d283..a662b90357c3 100644
>>>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>>>> kthread_flush_work(work);
>>>>>> }
>>>>>>
>>>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>>>> +
>>>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>>>> +{
>>>>>> + struct kthread_work work;
>>>>>> +
>>>>>> + kthread_init_work(&work, flush_work_fn);
>>>>>
>>>>> If the work is already queued, doesn't it break the linked list
>>>>> because of the memset in kthread_init_work?
>>>>
>>>> work is a local variable. It completes before vdpasim_flush_work returns,
>>>> thus is never already queued on entry to vdpasim_flush_work.
>>>> Am I missing your point?
>>>
>>> No, sorry, I was the one missing that. Thanks for explaining it :)!
>>>
>>> I'm not so used to the kthread queue, but why not calling
>>> kthread_flush_work on vdpasim->work directly?
>>
>> vdpasim->work is not the only work posted to vdpasim->worker; see
>> vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
>> they are all flushed.
>
> But it is ok to have concurrent mm updates, isn't it? Moreover, they
> can be enqueued immediately after the kthread_flush_work already, as
> there is no lock protecting it.

Agreed on both, thanks. I will simplify and only flush vdpasim->work.

- Steve

>>>>>> + kthread_queue_work(vdpasim->worker, &work);
>>>>>> + kthread_flush_work(&work);
>>>>>> +}
>>>>>> +
>>>>>> static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>>>> {
>>>>>> return container_of(vdpa, struct vdpasim, vdpa);
>>>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>>>> vdpasim->running = false;
>>>>>> mutex_unlock(&vdpasim->mutex);
>>>>>>
>>>>>> + vdpasim_flush_work(vdpasim);
>>>>>
>>>>> Do we need to protect the case where vdpasim_kick_vq and
>>>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>>>> not be doing it but buggy or malicious could be. Just calling
>>>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>>>> doesn't it?
>>>>
>>>> Good catch. I need to serialize access to vdpasim->running plus the worker queue
>>>> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is called
>>>> from non-task contexts, I should define a new spinlock to be acquired in both functions.
>>>>
>>>> - Steve
>>>>
>>>
>>
>