Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp298987yba; Wed, 24 Apr 2019 01:07:38 -0700 (PDT) X-Google-Smtp-Source: APXvYqxNYNWQFnroSXz4+vRoeQydHp174n+TMhV7UGJ+NVjLN/C7XIW9Vfyj9snpJ7HwcHJXGvQi X-Received: by 2002:a17:902:70c6:: with SMTP id l6mr30263882plt.95.1556093258018; Wed, 24 Apr 2019 01:07:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556093258; cv=none; d=google.com; s=arc-20160816; b=H2LHBd9NCslgH1H2z43uKwEFHo1z0NYSAx3aq2pZQO0Se3GV9fdZU5huyVB/JFfuoc KwPNXA1t8pWFCFQYvYNKu8W+7BIivxhxcd09QOKTZTIj5BInrEoCUHrXptQRczZOi1vM TT/V9tjzZZnGFkk7nTEMilnl34VI+ApAlYxGPQ9a0gXrK34JEUIiuc74vuKus2Ycy+sL jvYO+70U7bGDxptg1zO+D5eH2HnGiqHqt1IC7T/dmLVKEVD5DgJoEU0XR5obLiDw55UD 6C7fpxm2ieOS62MSyVi5bZTqwOy0XTHs/7AJdaG4e62qBHVnqbsJZtNn5nrVMfZYtOE7 msIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date; bh=P0jzWU4zAG/WKscq43D72BaTfHuyQcz6tGJogDiDgmk=; b=BDjtOumLzQpPlSUSNDbT2WZN+w0h69Zv2x7VCcM89STPIocVwwEMbCRN7ftaGIoDeX xcTCg4x/mL14EJRBlNcohos4iLsskzb8gBP1n2FfBRPiyenpX36DBBfqFabTkYeo/fua BEV1vba3VVO3zIG8ticuFsPBevnVAMxqaSOLxAX93rpwB38L3QddvjIXfQ9f/ioEK+gm 1MbwEe9H9ZfwhBhmapWl2GVp0U+/rHEPkmaLR0JW+TfkFGX/8eZGrzv+b2Hslyktgilh YnoKuL0SSDtBmRSo5iUw0lBIn68vKwQQHNJf7xav3vqwC+0lr2hHpMXEIKkFqIiTeTRo 9xPQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i70si16464996pge.476.2019.04.24.01.07.20; Wed, 24 Apr 2019 01:07:38 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728832AbfDXH4n (ORCPT + 99 others); Wed, 24 Apr 2019 03:56:43 -0400 Received: from mx1.redhat.com ([209.132.183.28]:60372 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727112AbfDXH4n (ORCPT ); Wed, 24 Apr 2019 03:56:43 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 81FB281DFA; Wed, 24 Apr 2019 07:56:41 +0000 (UTC) Received: from gondolin (unknown [10.40.205.6]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9F255600C5; Wed, 24 Apr 2019 07:56:28 +0000 (UTC) Date: Wed, 24 Apr 2019 09:56:24 +0200 From: Cornelia Huck To: Yan Zhao Cc: "cjia@nvidia.com" , "kvm@vger.kernel.org" , "aik@ozlabs.ru" , "Zhengxiao.zx@alibaba-inc.com" , "shuangtai.tst@alibaba-inc.com" , "qemu-devel@nongnu.org" , "kwankhede@nvidia.com" , "eauger@redhat.com" , "Liu, Yi L" , "eskultet@redhat.com" , "Yang, Ziye" , "mlevitsk@redhat.com" , "pasic@linux.ibm.com" , "libvir-list@redhat.com" , "arei.gonglei@huawei.com" , "felipe@nutanix.com" , "Ken.Xue@amd.com" , "Tian, Kevin" , "dgilbert@redhat.com" , "zhenyuw@linux.intel.com" , "alex.williamson@redhat.com" , "intel-gvt-dev@lists.freedesktop.org" , "Liu, Changpeng" , "linux-kernel@vger.kernel.org" , "Wang, Zhi A" , "jonathan.davies@nutanix.com" , "He, Shaopeng" Subject: Re: [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device Message-ID: <20190424095624.0ce97328.cohuck@redhat.com> In-Reply-To: <20190424031036.GB26247@joy-OptiPlex-7040> References: <20190419083258.19580-1-yan.y.zhao@intel.com> <20190419083505.19654-1-yan.y.zhao@intel.com> <20190423115932.42619422.cohuck@redhat.com> <20190424031036.GB26247@joy-OptiPlex-7040> Organization: Red Hat GmbH MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Wed, 24 Apr 2019 07:56:41 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 23 Apr 2019 23:10:37 -0400 Yan Zhao wrote: > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote: > > On Fri, 19 Apr 2019 04:35:04 -0400 > > Yan Zhao wrote: > > > > > device version attribute in mdev sysfs is used by user space software > > > (e.g. libvirt) to query device compatibility for live migration of VFIO > > > mdev devices. This attribute is mandatory if a mdev device supports live > > > migration. > > > > > > It consists of two parts: common part and vendor proprietary part. > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits > > > identifies device type. e.g., for pci device, it is > > > "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16). > > > vendor proprietary part: this part is varied in length. vendor driver can > > > specify any string to identify a device. > > > > > > When reading this attribute, it should show device version string of the > > > device of type . If a device does not support live migration, it > > > should return errno. > > > > It might make more sense if the driver does not register the attribute > > for the device in that case at all. > > > yes. what about rephrase like this: > " > When reading this attribute, it should show device version string of the > device of type . > If a device does not support live migration, it has two choices: > 1. do not register this version attribute > 2. return -ENODEV on rw this version attribute "return -ENODEV when accessing the version attribute" ? > Choice 1 is preferred. > " > > > > > When writing a string to this attribute, it returns errno for > > > incompatibility or returns written string length in compatibility case. > > > If a device does not support live migration, it always returns errno. > > > > > > For user space software to use: > > > 1. > > > Before starting live migration, user space software first reads source side > > > mdev device's version. e.g. > > > "#cat \ > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version" > > > 00028086-193b-i915-GVTg_V5_4 > > > > > > 2. > > > Then, user space software writes the source side returned version string > > > to device version attribute in target side, and checks the return value. > > > If a negative errno is returned in the target side, then mdev devices in > > > source and target sides are not compatible; > > > If a positive number is returned and it equals to the length of written > > > string, then the two mdev devices in source and target side are compatible. > > > e.g. > > > (a) compatibility case > > > "# echo 00028086-193b-i915-GVTg_V5_4 > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version" > > > > > > (b) incompatibility case > > > "#echo 00028086-193b-i915-GVTg_V5_1 > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version" > > > -bash: echo: write error: Invalid argument > > > > > > 3. if two mdev devices are compatible, user space software can start > > > live migration, and vice versa. > > > > > > Note: if a mdev device does not support live migration, it either does > > > not provide a version attribute, or always returns errno when its version > > > attribute is read/written. > > > > > > Cc: Alex Williamson > > > Cc: Erik Skultety > > > Cc: "Dr. David Alan Gilbert" > > > Cc: Cornelia Huck > > > Cc: "Tian, Kevin" > > > Cc: Zhenyu Wang > > > Cc: "Wang, Zhi A" > > > Cc: Neo Jia > > > Cc: Kirti Wankhede > > > > > > Signed-off-by: Yan Zhao > > > --- > > > Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++ > > > samples/vfio-mdev/mbochs.c | 17 ++++++++++++ > > > samples/vfio-mdev/mdpy.c | 16 ++++++++++++ > > > samples/vfio-mdev/mtty.c | 16 ++++++++++++ > > > 4 files changed, 85 insertions(+) > > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt > > > index c3f69bcaf96e..bc28471c0667 100644 > > > --- a/Documentation/vfio-mediated-device.txt > > > +++ b/Documentation/vfio-mediated-device.txt > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device > > > | | |--- available_instances > > > | | |--- device_api > > > | | |--- description > > > + | | |--- version > > > | | |--- [devices] > > > | |--- [] > > > | | |--- create > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device > > > | | |--- available_instances > > > | | |--- device_api > > > | | |--- description > > > + | | |--- version > > > | | |--- [devices] > > > | |--- [] > > > | |--- create > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device > > > | |--- available_instances > > > | |--- device_api > > > | |--- description > > > + | |--- version > > > | |--- [devices] > > > > > > * [mdev_supported_types] > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device > > > [], device_api, and available_instances are mandatory attributes > > > that should be provided by vendor driver. > > > > > > + version is a mandatory attribute if a mdev device supports live migration. > > > > What about "An mdev device wishing to support live migration must > > provide the version attribute."? > yes, I just want to keep consistent with the line above it > " [], device_api, and available_instances are mandatory attributes > that should be provided by vendor driver." > what about below one? > "version is a mandatory attribute if a mdev device wishing to support live > migration." My point is that an attribute is not mandatory if it can be left out :) (I'm not a native speaker, though; maybe this makes perfect sense after all?) Maybe "version is a required attribute if live migration is supported for an mdev device"? > > > > > + > > > * [] > > > > > > The [] name is created by adding the device driver string as a prefix > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device > > > This attribute should show the number of devices of type that can be > > > created. > > > > > > +* version > > > + > > > + This attribute is rw. It is used to check whether two devices are compatible > > > + for live migration. If this attribute is missing, then the corresponding mdev > > > + device is regarded as not supporting live migration. > > > + > > > + It consists of two parts: common part and vendor proprietary part. > > > + common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies > > > + device type. e.g., for pci device, it is > > > + "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16). > > > + vendor proprietary part: this part is varied in length. vendor driver can > > > + specify any string to identify a device. > > > + > > > + When reading this attribute, it should show device version string of the device > > > + of type . If a device does not support live migration, it should > > > + return errno. > > > + When writing a string to this attribute, it returns errno for incompatibility > > > + or returns written string length in compatibility case. If a device does not > > > + support live migration, it always returns errno. > > > > I'm not sure whether a device that does not support live migration > > should expose this attribute in the first place. Or is that to cover > > cases where a driver supports live migration only for some of the > > devices it supports? > yes, driver returning error code is to cover the cases where only part of devices it > supports can be migrated. > > > > Also, I'm not sure if a string that has to be parsed is a good idea... > > is this 'version' attribute supposed to convey some human-readable > > information as well? The procedure you describe for compatibility > > checking does the checking within the vendor driver which I would > > expect to have a table/rules for that anyway. > right. if a vendor driver has the confidence to migrate between devices of > diffent platform or mdev types, it can maintain a compatibility table for that > purpose. That's the reason why we would leave the compatibility check to vendor > driver. vendor driver can freely choose its own complicated way to decide > which device is migratable to which device. I think there are two scenarios here: - Migrating between different device types, which is unlikely to work, except in special cases. - Migrating between different versions of the same device type, which may work for some drivers/devices (and at least migrating to a newer version looks quite reasonable). But both should be something that is decided by the individual driver; I hope we don't want to support migration between different drivers :-O Can we make this a driver-defined format? > > > I think you should also specify which errno writing an incompatible id > > is supposed to return (probably best something different than if the > > device does not support live migration at all, if we stick with > > creating the attribute in that case.) > Agree. I'll define it clearly in next revison. Thanks!