2022-02-02 18:24:58

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver

On Wed, Feb 02, 2022 at 02:34:52PM +0000, Shameerali Kolothum Thodi wrote:

> > There are few topics to consider:
> > - Which of the three feature sets (STOP_COPY, P2P and PRECOPY) make
> > sense for this driver?
>
> I think it will be STOP_COPY only for now. We might have PRECOPY feature once
> we have the SMMUv3 HTTU support in future.

HTTU is the dirty tracking feature? To be clear VFIO migration support
for PRECOPY has nothing to do with IOMMU based dirty page tracking.

> > - I think we discussed the P2P implementation and decided it would
> > work for this device? Can you re-read and confirm?
>
> In our case these devices are Integrated End Point devices and doesn't have
> P2P DMA capability. Hence the FSM arcs will be limited to STOP_COPY feature
> I guess. Also, since we cannot guarantee a NDMA state in STOP, my
> assumption currently is the onus of making sure that no MMIO access happens
> in STOP is on the user. Is that a valid assumption?

Yes, you can treat RUNNING_P2P as the same as STOP and rely on no MMIO
access to sustain it.

(and I'm wondering sometimes if we should rename RUNNING_P2P to
STOP_P2P - ie the device is stopped but still allows inbound P2P to
make this clearer)

> Do we need to set the below before the feature query?
> Or am I using a wrong Qemu/kernel repo?
>
> +++ b/hw/vfio/migration.c
> @@ -488,6 +488,7 @@ static int vfio_migration_query_flags(VFIODevice
> *vbasedev, uint64_t *mig_flags)
> struct vfio_device_feature_migration *mig = (void *)feature->data;
>
> feature->argsz = sizeof(buf);
> + feature->flags = VFIO_DEVICE_FEATURE_MIGRATION | VFIO_DEVICE_FEATURE_GET;
> if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature) != 0)
> return -EOPNOTSUPP;

Oh, this is my mistake I thought this got pushed to that github
already but didn't, I updated it.

If you have a prototype can you post another RFC?

Thanks,
Jason


Subject: RE: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver



> -----Original Message-----
> From: Jason Gunthorpe [mailto:[email protected]]
> Sent: 02 February 2022 15:40
> To: Shameerali Kolothum Thodi <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; Linuxarm <[email protected]>; liulongfang
> <[email protected]>; Zengtao (B) <[email protected]>;
> yuzenghui <[email protected]>; Jonathan Cameron
> <[email protected]>; Wangzhou (B) <[email protected]>
> Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver
>
> On Wed, Feb 02, 2022 at 02:34:52PM +0000, Shameerali Kolothum Thodi
> wrote:
>
> > > There are few topics to consider:
> > > - Which of the three feature sets (STOP_COPY, P2P and PRECOPY) make
> > > sense for this driver?
> >
> > I think it will be STOP_COPY only for now. We might have PRECOPY
> > feature once we have the SMMUv3 HTTU support in future.
>
> HTTU is the dirty tracking feature? To be clear VFIO migration support for
> PRECOPY has nothing to do with IOMMU based dirty page tracking.

Yes, it is based on the IOMMU hardware dirty bit management support.
A RFC was posted sometime back,
https://lore.kernel.org/kvm/[email protected]/

Ok, my guess was that the PRECOPY here was related. Thanks for clarifying.

>
> > > - I think we discussed the P2P implementation and decided it would
> > > work for this device? Can you re-read and confirm?
> >
> > In our case these devices are Integrated End Point devices and doesn't
> > have P2P DMA capability. Hence the FSM arcs will be limited to
> > STOP_COPY feature I guess. Also, since we cannot guarantee a NDMA
> > state in STOP, my assumption currently is the onus of making sure that
> > no MMIO access happens in STOP is on the user. Is that a valid assumption?
>
> Yes, you can treat RUNNING_P2P as the same as STOP and rely on no MMIO
> access to sustain it.

Ok.

> (and I'm wondering sometimes if we should rename RUNNING_P2P to
> STOP_P2P - ie the device is stopped but still allows inbound P2P to make this
> clearer)
>
> > Do we need to set the below before the feature query?
> > Or am I using a wrong Qemu/kernel repo?
> >
> > +++ b/hw/vfio/migration.c
> > @@ -488,6 +488,7 @@ static int vfio_migration_query_flags(VFIODevice
> > *vbasedev, uint64_t *mig_flags)
> > struct vfio_device_feature_migration *mig = (void
> > *)feature->data;
> >
> > feature->argsz = sizeof(buf);
> > + feature->flags = VFIO_DEVICE_FEATURE_MIGRATION |
> > + VFIO_DEVICE_FEATURE_GET;
> > if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature) != 0)
> > return -EOPNOTSUPP;
>
> Oh, this is my mistake I thought this got pushed to that github already but
> didn't, I updated it.

Ok. Thanks.

> If you have a prototype can you post another RFC?

Sure, will do. I just started and has only a skeleton proto based on v2 now.
Will send out a RFC soon once I have all the FSM arcs implemented
and sanity tested.

Thanks,
Shameer