Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp1646787pxb; Wed, 2 Feb 2022 09:20:53 -0800 (PST) X-Google-Smtp-Source: ABdhPJz6Vya5vF4gjEbWxMWAuElSdXSHCwE2WFDLO2TonTN2h7RnJ3wxj9uU64YYIvjdG+vSa6Zj X-Received: by 2002:a17:902:a9c2:: with SMTP id b2mr31694023plr.168.1643822453230; Wed, 02 Feb 2022 09:20:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1643822453; cv=none; d=google.com; s=arc-20160816; b=mrLqGPWg9nSs1sUi1SU29nxEnJyozIX12Hkcjz9rnqyixZ0ebYtTMD22j1uEizRzYL bmGQ+fgJRk4uKVtdnsqBR22S6vp8MYM1z3dDYts/ko3geK45V0O+QpyxbmspKJneO8wL jtFxGep+SpnONJ0lwbku4akLwqzFnjMs8XvWOGl4vaexSFzv8dhee7Bo/BSsKHRHvOCD 0MF1rfy6P2cK61cx2EgH9prCq8BMNNb4oW/+W6NUDsjrhANv3K2HZYoP+C0O6UrBvOxs oiRFx0bmSnfX+mHE+W6FgQX018G8ycTZ0ZdYeRXg9GzLg6v9Mj+qyJ0xjTr7VXVDr7P7 9sfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :content-language:accept-language:in-reply-to:references:message-id :date:thread-index:thread-topic:subject:cc:to:from; bh=R2dD/oO2U56TsL0aDkNGDOr1Tg6ifXhj47PmARS+j5U=; b=aWNDfpK3dCIvRcxzIuNMXZnnUv7wKRyY6uEncC4a4+Ts5rH6wsQoTt0331AVYE7Lq5 sx0SveLnB/ZtdrBk3FfUy8x/Vj1bB8FOkzCjZOFXcXR6Kfxkg/49JI4GH8WkprbCA2BY Spm5LlNcQ4sFL/gWBHh99ui0LR3EiQRmxv+GvelPveYStsKc1a+BXxCpluHUPkOwpPDQ LsPiPGlC38GRRXIxE6iYuYRqWAEintm57wTOO9/CBNIQzvB0MrmbxIAPPbZ1lhxlh2p/ cWoO2nv+NWoPBnK3+xhhJDopuMJFv1YnX+fH8qtN3iIMugeoBig/uEPw1eYN8zgBpAF5 Mnww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id m15si19164394pfc.261.2022.02.02.09.20.41; Wed, 02 Feb 2022 09:20:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345069AbiBBOe7 convert rfc822-to-8bit (ORCPT + 99 others); Wed, 2 Feb 2022 09:34:59 -0500 Received: from frasgout.his.huawei.com ([185.176.79.56]:4661 "EHLO frasgout.his.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234062AbiBBOez (ORCPT ); Wed, 2 Feb 2022 09:34:55 -0500 Received: from fraeml714-chm.china.huawei.com (unknown [172.18.147.200]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4Jpkp33h5Xz67cpv; Wed, 2 Feb 2022 22:34:19 +0800 (CST) Received: from lhreml713-chm.china.huawei.com (10.201.108.64) by fraeml714-chm.china.huawei.com (10.206.15.33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.21; Wed, 2 Feb 2022 15:34:52 +0100 Received: from lhreml710-chm.china.huawei.com (10.201.108.61) by lhreml713-chm.china.huawei.com (10.201.108.64) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.21; Wed, 2 Feb 2022 14:34:52 +0000 Received: from lhreml710-chm.china.huawei.com ([169.254.81.184]) by lhreml710-chm.china.huawei.com ([169.254.81.184]) with mapi id 15.01.2308.021; Wed, 2 Feb 2022 14:34:52 +0000 From: Shameerali Kolothum Thodi To: Jason Gunthorpe CC: "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "linux-crypto@vger.kernel.org" , "alex.williamson@redhat.com" , "mgurtovoy@nvidia.com" , Linuxarm , liulongfang , "Zengtao (B)" , yuzenghui , "Jonathan Cameron" , "Wangzhou (B)" Subject: RE: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver Thread-Topic: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver Thread-Index: AQHYGDbmFHRHIXj6i0KnDwR2h2vbX6yAR7ag Date: Wed, 2 Feb 2022 14:34:52 +0000 Message-ID: References: <20210702095849.1610-1-shameerali.kolothum.thodi@huawei.com> <20220202131448.GA2538420@nvidia.com> In-Reply-To: <20220202131448.GA2538420@nvidia.com> Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.202.227.178] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > -----Original Message----- > From: Jason Gunthorpe [mailto:jgg@nvidia.com] > Sent: 02 February 2022 13:15 > To: Shameerali Kolothum Thodi > Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org; > linux-crypto@vger.kernel.org; alex.williamson@redhat.com; > mgurtovoy@nvidia.com; Linuxarm ; liulongfang > ; Zengtao (B) ; > yuzenghui ; Jonathan Cameron > ; Wangzhou (B) > Subject: Re: [RFC v2 0/4] vfio/hisilicon: add acc live migration driver > > On Fri, Jul 02, 2021 at 10:58:45AM +0100, Shameer Kolothum wrote: > > This series attempts to add vfio live migration support for > > HiSilicon ACC VF devices. HiSilicon ACC VF device MMIO space > > includes both the functional register space and migration > > control register space. As discussed in RFCv1[0], this may create > > security issues as these regions get shared between the Guest > > driver and the migration driver. Based on the feedback, we tried > > to address those concerns in this version. > > > > This is now based on the new vfio-pci-core framework proposal[1]. > > Understand that the framework proposal is still under discussion, > > but really appreciate any feedback on the approach taken here > > to mitigate the security risks. > > Hi, can you look at the v6 proposal for the mlx5 implementation of the > migration API and see if it meets hisilicon acc's needs as well? > > https://lore.kernel.org/all/20220130160826.32449-1-yishaih@nvidia.com/ Yes, I saw that one. Thanks for that and is now looking into it. > > There are few topics to consider: > - Which of the three feature sets (STOP_COPY, P2P and PRECOPY) make > sense for this driver? I think it will be STOP_COPY only for now. We might have PRECOPY feature once we have the SMMUv3 HTTU support in future. > > I see pf_qm_state_pre_save() but didn't understand why it wanted to > send the first 32 bytes in the PRECOPY mode? It is fine, but it > will add some complexity to continue to do this. That was mainly to do a quick verification between src and dst compatibility before we start saving the state. I think probably we can delay that check for later. > - I think we discussed the P2P implementation and decided it would > work for this device? Can you re-read and confirm? In our case these devices are Integrated End Point devices and doesn't have P2P DMA capability. Hence the FSM arcs will be limited to STOP_COPY feature I guess. Also, since we cannot guarantee a NDMA state in STOP, my assumption currently is the onus of making sure that no MMIO access happens in STOP is on the user. Is that a valid assumption? > - Are the arcs we defined going to work here as well? The current > implementation in hisi_acc_vf_set_device_state() is very far away > from what the v1 protocol is, so I'm having a hard time guessing, > but.. Right. The FSM has changed a couple of times since we posted this. I am going to rebase all that now. > RESUMING -> STOP > Probably vf_qm_state_resume() > > RUNNING -> STOP > vf_qm_fun_restart() - that is oddly named.. > > STOP -> RESUMING > Seems to be a nop (likely a bug) > > STOP -> RUNNING > Not implemented currenty? (also a bug) > > STOP -> STOP_COPY > pf_qm_state_pre_save / vf_qm_state_save > > STOP_COPY -> STOP > NOP I will check and verify this. > And the modification for the P2P/NO DMA is presumably just > fun_restart too since stopping the device and stopping DMA are > going to be the same thing here? Yes, in our case stopping device and stopping DMA are effectively the same thing. > > The mlx5 implementation linked above is a full example you can cut and > paste from for how to implement the state function and the how to do > the data transfer. The f_ops read/write implementation for acc looks > trivial as it only streams the fixed size and pre-allocated 'struct > acc_vf_data' > > It looks like it would be a short path to implement our v2 proposal > and remove a lot of driver code, as we saw in mlx5. > Ok. These are the git repo I am using for the rework, https://github.com/jgunthorpe/qemu/commits/vfio_migration_v2 https://github.com/jgunthorpe/linux/tree/vfio_migration_v2 Please let me know if the above are not up to date. Also, just noted that my quick prototype is now failing with below error, " Error: VFIO device doesn't support migration" Do we need to set the below before the feature query? Or am I using a wrong Qemu/kernel repo? --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -488,6 +488,7 @@ static int vfio_migration_query_flags(VFIODevice *vbasedev, uint64_t *mig_flags) struct vfio_device_feature_migration *mig = (void *)feature->data; feature->argsz = sizeof(buf); + feature->flags = VFIO_DEVICE_FEATURE_MIGRATION | VFIO_DEVICE_FEATURE_GET; if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature) != 0) return -EOPNOTSUPP; Thanks, Shameer