This operation is optional: It it's not implemented, backend feature bit
will not be exposed.
Signed-off-by: Eugenio Pérez <[email protected]>
---
include/linux/vdpa.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 15af802d41c4..ddfebc4e1e01 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -215,6 +215,11 @@ struct vdpa_map_file {
* @reset: Reset device
* @vdev: vdpa device
* Returns integer: success (0) or error (< 0)
+ * @stop: Stop or resume the device (optional, but it must
+ * be implemented if require device stop)
+ * @vdev: vdpa device
+ * @stop: stop (true), not stop (false)
+ * Returns integer: success (0) or error (< 0)
* @get_config_size: Get the size of the configuration space includes
* fields that are conditional on feature bits.
* @vdev: vdpa device
@@ -316,6 +321,7 @@ struct vdpa_config_ops {
u8 (*get_status)(struct vdpa_device *vdev);
void (*set_status)(struct vdpa_device *vdev, u8 status);
int (*reset)(struct vdpa_device *vdev);
+ int (*stop)(struct vdpa_device *vdev, bool stop);
size_t (*get_config_size)(struct vdpa_device *vdev);
void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
void *buf, unsigned int len);
--
2.27.0
On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
>
>
> On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
>> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <[email protected]>
>> wrote:
>>>
>>>
>>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>>>> This operation is optional: It it's not implemented, backend
>>>> feature bit
>>>> will not be exposed.
>>>>
>>>> Signed-off-by: Eugenio Pérez <[email protected]>
>>>> ---
>>>> include/linux/vdpa.h | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>>> index 15af802d41c4..ddfebc4e1e01 100644
>>>> --- a/include/linux/vdpa.h
>>>> +++ b/include/linux/vdpa.h
>>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>>>> * @reset: Reset device
>>>> * @vdev: vdpa device
>>>> * Returns integer: success (0) or
>>>> error (< 0)
>>>> + * @stop: Stop or resume the device (optional,
>>>> but it must
>>>> + * be implemented if require device stop)
>>>> + * @vdev: vdpa device
>>>> + * @stop: stop (true), not stop (false)
>>>> + * Returns integer: success (0) or error
>>>> (< 0)
>>> Is this uAPI meant to address all use cases described in the full blown
>>> _F_STOP virtio spec proposal, such as:
>>>
>>> --------------%<--------------
>>>
>>> ...... the device MUST finish any in flight
>>> operations after the driver writes STOP. Depending on the device, it
>>> can do it
>>> in many ways as long as the driver can recover its normal operation
>>> if it
>>> resumes the device without the need of resetting it:
>>>
>>> - Drain and wait for the completion of all pending requests until a
>>> convenient avail descriptor. Ignore any other posterior descriptor.
>>> - Return a device-specific failure for these descriptors, so the driver
>>> can choose to retry or to cancel them.
>>> - Mark them as done even if they are not, if the kind of device can
>>> assume to lose them.
>>> --------------%<--------------
>>>
>> Right, this is totally underspecified in this series.
>>
>> I'll expand on it in the next version, but that text proposed to
>> virtio-comment was complicated and misleading. I find better to get
>> the previous version description. Would the next description work?
>>
>> ```
>> After the return of ioctl, the device MUST finish any pending
>> operations like
>> in flight requests. It must also preserve all the necessary state (the
>> virtqueue vring base plus the possible device specific states)
> Hmmm, "possible device specific states" is a bit vague. Does it
> require the device to save any device internal state that is not
> defined in the virtio spec - such as any failed in-flight requests to
> resubmit upon resume? Or you would lean on SVQ to intercept it in
> depth and save it with some other means? I think network device also
> has internal state such as flow steering state that needs bookkeeping
> as well.
Noted that I understand you may introduce additional feature call
similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request,
but since that's is a get interface, I assume the actual state
preserving should still take place in this STOP call.
-Siwei
>
> A follow-up question is what is the use of the `stop` argument of
> false, does it require the device to support resume? I seem to recall
> this is something to abandon in favor of device reset plus setting
> queue base/addr after. Or it's just a optional feature that may be
> device specific (if one can do so in simple way).
>
> -Siwei
>
>> that is required
>> for restoring in the future.
>>
>> In the future, we will provide features similar to
>> VHOST_USER_GET_INFLIGHT_FD
>> so the device can save pending operations.
>> ```
>>
>> Thanks for pointing it out!
>>
>>
>>
>>
>>
>>> E.g. do I assume correctly all in flight requests are flushed after
>>> return from this uAPI call? Or some of pending requests may be subject
>>> to loss or failure? How does the caller/user specify these various
>>> options (if there are) for device stop?
>>>
>>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
>>> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
>>> is perhaps not so convincing.
>>>
>>> -Siwei
>>>
>>>> * @get_config_size: Get the size of the configuration space
>>>> includes
>>>> * fields that are conditional on
>>>> feature bits.
>>>> * @vdev: vdpa device
>>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>>>> u8 (*get_status)(struct vdpa_device *vdev);
>>>> void (*set_status)(struct vdpa_device *vdev, u8 status);
>>>> int (*reset)(struct vdpa_device *vdev);
>>>> + int (*stop)(struct vdpa_device *vdev, bool stop);
>>>> size_t (*get_config_size)(struct vdpa_device *vdev);
>>>> void (*get_config)(struct vdpa_device *vdev, unsigned int
>>>> offset,
>>>> void *buf, unsigned int len);
>
在 2022/5/24 08:01, Si-Wei Liu 写道:
>
>
> On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
>>
>>
>> On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
>>> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <[email protected]>
>>> wrote:
>>>>
>>>>
>>>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>>>>> This operation is optional: It it's not implemented, backend
>>>>> feature bit
>>>>> will not be exposed.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <[email protected]>
>>>>> ---
>>>>> include/linux/vdpa.h | 6 ++++++
>>>>> 1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>>>> index 15af802d41c4..ddfebc4e1e01 100644
>>>>> --- a/include/linux/vdpa.h
>>>>> +++ b/include/linux/vdpa.h
>>>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>>>>> * @reset: Reset device
>>>>> * @vdev: vdpa device
>>>>> * Returns integer: success (0) or
>>>>> error (< 0)
>>>>> + * @stop: Stop or resume the device (optional,
>>>>> but it must
>>>>> + * be implemented if require device stop)
>>>>> + * @vdev: vdpa device
>>>>> + * @stop: stop (true), not stop (false)
>>>>> + * Returns integer: success (0) or
>>>>> error (< 0)
>>>> Is this uAPI meant to address all use cases described in the full
>>>> blown
>>>> _F_STOP virtio spec proposal, such as:
>>>>
>>>> --------------%<--------------
>>>>
>>>> ...... the device MUST finish any in flight
>>>> operations after the driver writes STOP. Depending on the device, it
>>>> can do it
>>>> in many ways as long as the driver can recover its normal operation
>>>> if it
>>>> resumes the device without the need of resetting it:
>>>>
>>>> - Drain and wait for the completion of all pending requests until a
>>>> convenient avail descriptor. Ignore any other posterior
>>>> descriptor.
>>>> - Return a device-specific failure for these descriptors, so the
>>>> driver
>>>> can choose to retry or to cancel them.
>>>> - Mark them as done even if they are not, if the kind of device can
>>>> assume to lose them.
>>>> --------------%<--------------
>>>>
>>> Right, this is totally underspecified in this series.
>>>
>>> I'll expand on it in the next version, but that text proposed to
>>> virtio-comment was complicated and misleading. I find better to get
>>> the previous version description. Would the next description work?
>>>
>>> ```
>>> After the return of ioctl, the device MUST finish any pending
>>> operations like
>>> in flight requests. It must also preserve all the necessary state (the
>>> virtqueue vring base plus the possible device specific states)
>> Hmmm, "possible device specific states" is a bit vague. Does it
>> require the device to save any device internal state that is not
>> defined in the virtio spec - such as any failed in-flight requests to
>> resubmit upon resume? Or you would lean on SVQ to intercept it in
>> depth and save it with some other means? I think network device also
>> has internal state such as flow steering state that needs bookkeeping
>> as well.
> Noted that I understand you may introduce additional feature call
> similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request,
> but since that's is a get interface, I assume the actual state
> preserving should still take place in this STOP call.
>
Yes, I think so.
> -Siwei
>
>>
>> A follow-up question is what is the use of the `stop` argument of
>> false, does it require the device to support resume?
Yes, this is required by the hypervisor e.g for Qemu it supports vm
stop/resume.
>> I seem to recall this is something to abandon in favor of device
>> reset plus setting queue base/addr after. Or it's just a optional
>> feature that may be device specific (if one can do so in simple way).
Rest is more like a workarond consider we don't have a stop API.
Consider we don't add stop at the beginning, it can only be an optional
feature.
Thanks
>>
>> -Siwei
>>
>>> that is required
>>> for restoring in the future.
>>>
>>> In the future, we will provide features similar to
>>> VHOST_USER_GET_INFLIGHT_FD
>>> so the device can save pending operations.
>>> ```
>>>
>>> Thanks for pointing it out!
>>>
>>>
>>>
>>>
>>>
>>>> E.g. do I assume correctly all in flight requests are flushed after
>>>> return from this uAPI call? Or some of pending requests may be subject
>>>> to loss or failure? How does the caller/user specify these various
>>>> options (if there are) for device stop?
>>>>
>>>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
>>>> as well to demo the stop handling. To just show it on vdpa-sim-net
>>>> IMHO
>>>> is perhaps not so convincing.
>>>>
>>>> -Siwei
>>>>
>>>>> * @get_config_size: Get the size of the configuration space
>>>>> includes
>>>>> * fields that are conditional on
>>>>> feature bits.
>>>>> * @vdev: vdpa device
>>>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>>>>> u8 (*get_status)(struct vdpa_device *vdev);
>>>>> void (*set_status)(struct vdpa_device *vdev, u8 status);
>>>>> int (*reset)(struct vdpa_device *vdev);
>>>>> + int (*stop)(struct vdpa_device *vdev, bool stop);
>>>>> size_t (*get_config_size)(struct vdpa_device *vdev);
>>>>> void (*get_config)(struct vdpa_device *vdev, unsigned int
>>>>> offset,
>>>>> void *buf, unsigned int len);
>>
>
On Tue, May 24, 2022 at 9:09 AM Stefano Garzarella <[email protected]> wrote:
>
> On Mon, May 23, 2022 at 09:20:14PM +0200, Eugenio Perez Martin wrote:
> >On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <[email protected]> wrote:
> >>
> >>
> >>
> >> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> >> > This operation is optional: It it's not implemented, backend feature bit
> >> > will not be exposed.
> >> >
> >> > Signed-off-by: Eugenio Pérez <[email protected]>
> >> > ---
> >> > include/linux/vdpa.h | 6 ++++++
> >> > 1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> >> > index 15af802d41c4..ddfebc4e1e01 100644
> >> > --- a/include/linux/vdpa.h
> >> > +++ b/include/linux/vdpa.h
> >> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
> >> > * @reset: Reset device
> >> > * @vdev: vdpa device
> >> > * Returns integer: success (0) or error (< 0)
> >> > + * @stop: Stop or resume the device (optional, but it must
> >> > + * be implemented if require device stop)
> >> > + * @vdev: vdpa device
> >> > + * @stop: stop (true), not stop (false)
> >> > + * Returns integer: success (0) or error (< 0)
> >> Is this uAPI meant to address all use cases described in the full blown
> >> _F_STOP virtio spec proposal, such as:
> >>
> >> --------------%<--------------
> >>
> >> ...... the device MUST finish any in flight
> >> operations after the driver writes STOP. Depending on the device, it
> >> can do it
> >> in many ways as long as the driver can recover its normal operation
> >> if it
> >> resumes the device without the need of resetting it:
> >>
> >> - Drain and wait for the completion of all pending requests until a
> >> convenient avail descriptor. Ignore any other posterior descriptor.
> >> - Return a device-specific failure for these descriptors, so the driver
> >> can choose to retry or to cancel them.
> >> - Mark them as done even if they are not, if the kind of device can
> >> assume to lose them.
> >> --------------%<--------------
> >>
> >
> >Right, this is totally underspecified in this series.
> >
> >I'll expand on it in the next version, but that text proposed to
> >virtio-comment was complicated and misleading. I find better to get
> >the previous version description. Would the next description work?
> >
> >```
> >After the return of ioctl, the device MUST finish any pending operations like
> >in flight requests. It must also preserve all the necessary state (the
> >virtqueue vring base plus the possible device specific states) that is required
> >for restoring in the future.
>
> For block devices wait for all in-flight requests could take several
> time.
>
> Could this be a problem if the caller gets stuck on this ioctl?
>
> If it could be a problem, maybe we should use an eventfd to signal that
> the device is successfully stopped.
>
For that particular problem I'd very much prefer to add directly an
ioctl to get the inflight descriptors. We know for sure we will need
them, and it will be cleaner in the long run.
As I understand the vdpa block simulator, there is no need to return
the inflight descriptors since all of the requests are processed in a
synchronous way. So, for this iteration, we could offer the stop
feature to qemu.
Other non-simulated devices would need it. Could it be delayed to
future development?
Thanks!
> Thanks,
> Stefano
>
On Tue, May 24, 2022 at 09:42:06AM +0200, Eugenio Perez Martin wrote:
>On Tue, May 24, 2022 at 9:09 AM Stefano Garzarella <[email protected]> wrote:
>>
>> On Mon, May 23, 2022 at 09:20:14PM +0200, Eugenio Perez Martin wrote:
>> >On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <[email protected]> wrote:
>> >>
>> >>
>> >>
>> >> On 5/20/2022 10:23 AM, Eugenio P?rez wrote:
>> >> > This operation is optional: It it's not implemented, backend feature bit
>> >> > will not be exposed.
>> >> >
>> >> > Signed-off-by: Eugenio P?rez <[email protected]>
>> >> > ---
>> >> > include/linux/vdpa.h | 6 ++++++
>> >> > 1 file changed, 6 insertions(+)
>> >> >
>> >> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> >> > index 15af802d41c4..ddfebc4e1e01 100644
>> >> > --- a/include/linux/vdpa.h
>> >> > +++ b/include/linux/vdpa.h
>> >> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
>> >> > * @reset: Reset device
>> >> > * @vdev: vdpa device
>> >> > * Returns integer: success (0) or error (< 0)
>> >> > + * @stop: Stop or resume the device (optional, but it must
>> >> > + * be implemented if require device stop)
>> >> > + * @vdev: vdpa device
>> >> > + * @stop: stop (true), not stop (false)
>> >> > + * Returns integer: success (0) or error (< 0)
>> >> Is this uAPI meant to address all use cases described in the full blown
>> >> _F_STOP virtio spec proposal, such as:
>> >>
>> >> --------------%<--------------
>> >>
>> >> ...... the device MUST finish any in flight
>> >> operations after the driver writes STOP. Depending on the device, it
>> >> can do it
>> >> in many ways as long as the driver can recover its normal operation
>> >> if it
>> >> resumes the device without the need of resetting it:
>> >>
>> >> - Drain and wait for the completion of all pending requests until a
>> >> convenient avail descriptor. Ignore any other posterior descriptor.
>> >> - Return a device-specific failure for these descriptors, so the driver
>> >> can choose to retry or to cancel them.
>> >> - Mark them as done even if they are not, if the kind of device can
>> >> assume to lose them.
>> >> --------------%<--------------
>> >>
>> >
>> >Right, this is totally underspecified in this series.
>> >
>> >I'll expand on it in the next version, but that text proposed to
>> >virtio-comment was complicated and misleading. I find better to get
>> >the previous version description. Would the next description work?
>> >
>> >```
>> >After the return of ioctl, the device MUST finish any pending operations like
>> >in flight requests. It must also preserve all the necessary state (the
>> >virtqueue vring base plus the possible device specific states) that is required
>> >for restoring in the future.
>>
>> For block devices wait for all in-flight requests could take several
>> time.
>>
>> Could this be a problem if the caller gets stuck on this ioctl?
>>
>> If it could be a problem, maybe we should use an eventfd to signal that
>> the device is successfully stopped.
>>
>
>For that particular problem I'd very much prefer to add directly an
>ioctl to get the inflight descriptors. We know for sure we will need
>them, and it will be cleaner in the long run.
Makes sense!
>
>As I understand the vdpa block simulator, there is no need to return
>the inflight descriptors since all of the requests are processed in a
>synchronous way. So, for this iteration, we could offer the stop
>feature to qemu.
Right, the simulator handles everything synchronously.
>
>Other non-simulated devices would need it. Could it be delayed to
>future development?
Yep, sure, it sounds like you already have a plan, so no problem :-)
Thanks,
Stefano
On Tue, May 24, 2022 at 2:01 AM Si-Wei Liu <[email protected]> wrote:
>
>
>
> On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
> >
> >
> > On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
> >> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <[email protected]>
> >> wrote:
> >>>
> >>>
> >>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> >>>> This operation is optional: It it's not implemented, backend
> >>>> feature bit
> >>>> will not be exposed.
> >>>>
> >>>> Signed-off-by: Eugenio Pérez <[email protected]>
> >>>> ---
> >>>> include/linux/vdpa.h | 6 ++++++
> >>>> 1 file changed, 6 insertions(+)
> >>>>
> >>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> >>>> index 15af802d41c4..ddfebc4e1e01 100644
> >>>> --- a/include/linux/vdpa.h
> >>>> +++ b/include/linux/vdpa.h
> >>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
> >>>> * @reset: Reset device
> >>>> * @vdev: vdpa device
> >>>> * Returns integer: success (0) or
> >>>> error (< 0)
> >>>> + * @stop: Stop or resume the device (optional,
> >>>> but it must
> >>>> + * be implemented if require device stop)
> >>>> + * @vdev: vdpa device
> >>>> + * @stop: stop (true), not stop (false)
> >>>> + * Returns integer: success (0) or error
> >>>> (< 0)
> >>> Is this uAPI meant to address all use cases described in the full blown
> >>> _F_STOP virtio spec proposal, such as:
> >>>
> >>> --------------%<--------------
> >>>
> >>> ...... the device MUST finish any in flight
> >>> operations after the driver writes STOP. Depending on the device, it
> >>> can do it
> >>> in many ways as long as the driver can recover its normal operation
> >>> if it
> >>> resumes the device without the need of resetting it:
> >>>
> >>> - Drain and wait for the completion of all pending requests until a
> >>> convenient avail descriptor. Ignore any other posterior descriptor.
> >>> - Return a device-specific failure for these descriptors, so the driver
> >>> can choose to retry or to cancel them.
> >>> - Mark them as done even if they are not, if the kind of device can
> >>> assume to lose them.
> >>> --------------%<--------------
> >>>
> >> Right, this is totally underspecified in this series.
> >>
> >> I'll expand on it in the next version, but that text proposed to
> >> virtio-comment was complicated and misleading. I find better to get
> >> the previous version description. Would the next description work?
> >>
> >> ```
> >> After the return of ioctl, the device MUST finish any pending
> >> operations like
> >> in flight requests. It must also preserve all the necessary state (the
> >> virtqueue vring base plus the possible device specific states)
> > Hmmm, "possible device specific states" is a bit vague. Does it
> > require the device to save any device internal state that is not
> > defined in the virtio spec - such as any failed in-flight requests to
> > resubmit upon resume?
I'd let that be device-specific. For example, the net simulator
doesn't need to store them, since it cannot stop while processing
buffers. Other net devices can also decide to simply drop or re-submit
tx frames.
I can check for the block simulator if that's possible too. For
hardware vdpa block devices, this should be combined with the future
"get inflight buffers" call for sure.
> > Or you would lean on SVQ to intercept it in
> > depth and save it with some other means? I think network device also
> > has internal state such as flow steering state that needs bookkeeping
> > as well.
Yes, for state set by the control vq a permanent SVQ is used only for
the cvq. For other things like config space vdpa already presents an
emulated one to the guest, so we're safe in that regard.
> Noted that I understand you may introduce additional feature call
> similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request,
> but since that's is a get interface, I assume the actual state
> preserving should still take place in this STOP call.
>
Right. I'll add all of this to the proposal.
Thanks!
> -Siwei
>
> >
> > A follow-up question is what is the use of the `stop` argument of
> > false, does it require the device to support resume? I seem to recall
> > this is something to abandon in favor of device reset plus setting
> > queue base/addr after. Or it's just a optional feature that may be
> > device specific (if one can do so in simple way).
> >
> > -Siwei
> >
> >> that is required
> >> for restoring in the future.
> >>
> >> In the future, we will provide features similar to
> >> VHOST_USER_GET_INFLIGHT_FD
> >> so the device can save pending operations.
> >> ```
> >>
> >> Thanks for pointing it out!
> >>
> >>
> >>
> >>
> >>
> >>> E.g. do I assume correctly all in flight requests are flushed after
> >>> return from this uAPI call? Or some of pending requests may be subject
> >>> to loss or failure? How does the caller/user specify these various
> >>> options (if there are) for device stop?
> >>>
> >>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
> >>> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
> >>> is perhaps not so convincing.
> >>>
> >>> -Siwei
> >>>
> >>>> * @get_config_size: Get the size of the configuration space
> >>>> includes
> >>>> * fields that are conditional on
> >>>> feature bits.
> >>>> * @vdev: vdpa device
> >>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
> >>>> u8 (*get_status)(struct vdpa_device *vdev);
> >>>> void (*set_status)(struct vdpa_device *vdev, u8 status);
> >>>> int (*reset)(struct vdpa_device *vdev);
> >>>> + int (*stop)(struct vdpa_device *vdev, bool stop);
> >>>> size_t (*get_config_size)(struct vdpa_device *vdev);
> >>>> void (*get_config)(struct vdpa_device *vdev, unsigned int
> >>>> offset,
> >>>> void *buf, unsigned int len);
> >
>