2023-09-20 10:51:42

by Zhu, Lingshan

Subject: Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg



On 9/20/2023 3:32 PM, Parav Pandit wrote:
>
>> From: Zhu, Lingshan <[email protected]>
>> Sent: Wednesday, September 20, 2023 12:58 PM
>>
>> On 9/20/2023 3:10 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <[email protected]>
>>>> Sent: Wednesday, September 20, 2023 12:37 PM
>>>>> The problem to overcome in [1] is, resume operation needs to be
>>>>> synchronous
>>>> as it involves large part of context to resume back, and hence just
>>>> asynchronously setting DRIVER_OK is not enough.
>>>>> The sw must verify back that device has resumed the operation and
>>>>> ready to
>>>> answer requests.
>>>> this is not live migration, all device status and other information
>>>> still stay in the device, no need to "resume" context, just resume running.
>>>>
>>> I am aware that it is not live migration. :)
>>>
>>> "Just resuming" involves lot of device setup task. The device implementation
>> does not know for how long a device is suspended.
>>> So for example, a VM is suspended for 6 hours, hence the device context
>> could be saved in a slow disk.
>>> Hence, when the resume is done, it needs to setup things again and driver got
>> to verify before accessing more from the device.
>> The restore procedures should perform by the hypervisor and done before set
>> DRIVER_OK and wake up the guest.
> Which is the signal to trigger the restore? Which is the trigger in physical device when there is no hypervisor?
>
> In my view, setting the DRIVER_OK is the signal regardless of hypervisor or physical device.
> Hence the re-read is must.
Yes, as I said below, should verify by re-read.
>
>> And the hypervisor/driver needs to check the device status by re-reading.
>>>> Like resume from a failed LM.
>>>>> This is slightly different flow than setting the DRIVER_OK for the
>>>>> first time
>>>> device initialization sequence as it does not involve large restoration.
>>>>> So, to merge two ideas, instead of doing DRIVER_OK to resume, the
>>>>> driver
>>>> should clear the SUSPEND bit and verify that it is out of SUSPEND.
>>>>> Because driver is still in _OK_ driving the device flipping the SUSPEND bit.
>>>> Please read the spec, it says:
>>>> The driver MUST NOT clear a device status bit
>>>>
>>> Yes, this is why either DRIVER_OK validation by the driver is needed or Jiqian's
>> synchronous new register..
>> so re-read
> Yes. re-read until set, Thanks.
>


2023-09-20 12:00:55

by Parav Pandit

Subject: RE: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg


> From: Zhu, Lingshan <[email protected]>
> Sent: Wednesday, September 20, 2023 1:16 PM

[..]
> > In my view, setting the DRIVER_OK is the signal regardless of hypervisor or
> physical device.
> > Hence the re-read is must.
> Yes, as I said below, should verify by re-read.
> >
Thanks.
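
For illustration only, here is a minimal sketch of the "set DRIVER_OK, then re-read until set" flow that both sides agree on above. It is not taken from the patch or from any driver. The `status_ops` accessors, `resume_and_verify()`, the poll limit, and the `fake_dev` model are all hypothetical names introduced for this example; only the DRIVER_OK bit value (4) comes from the virtio specification. A real driver would access `device_status` in `virtio_pci_common_cfg` through MMIO and would probably sleep or time out between reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* DRIVER_OK status bit, value as defined in the virtio specification. */
#define VIRTIO_CONFIG_S_DRIVER_OK 4

/* Hypothetical accessors for the device_status field (assumption). */
struct status_ops {
    uint8_t (*read)(void *ctx);
    void (*write)(void *ctx, uint8_t val);
};

/*
 * Request resume by setting DRIVER_OK, then re-read device_status until
 * the device reports the bit as set or max_polls reads have been done.
 * Returns true only once the read-back value contains DRIVER_OK.
 */
static bool resume_and_verify(const struct status_ops *ops, void *ctx,
                              unsigned max_polls)
{
    uint8_t status = ops->read(ctx);

    ops->write(ctx, status | VIRTIO_CONFIG_S_DRIVER_OK);

    for (unsigned i = 0; i < max_polls; i++) {
        if (ops->read(ctx) & VIRTIO_CONFIG_S_DRIVER_OK)
            return true;
    }
    return false;
}

/*
 * Toy device model (assumption): a written status only becomes visible
 * after `delay` further reads, standing in for a slow context restore.
 */
struct fake_dev {
    uint8_t status;     /* value the device currently reports */
    uint8_t requested;  /* last value written by the driver */
    unsigned delay;     /* reads remaining before the write takes effect */
};

static uint8_t fake_read(void *ctx)
{
    struct fake_dev *d = ctx;

    if (d->requested != d->status) {
        if (d->delay == 0)
            d->status = d->requested;
        else
            d->delay--;
    }
    return d->status;
}

static void fake_write(void *ctx, uint8_t val)
{
    struct fake_dev *d = ctx;

    d->requested = val;
}
```

With a model like this, a device that finishes its restore within the poll budget makes `resume_and_verify()` return true, while one that takes longer makes it return false, which is the case where the driver must not yet access more of the device.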