Date: Fri, 10 May 2019 11:48:38 +0200
From: Cornelia Huck
To: "Dr.
 David Alan Gilbert"
Cc: Alex Williamson, Yan Zhao, intel-gvt-dev@lists.freedesktop.org,
 arei.gonglei@huawei.com, aik@ozlabs.ru, Zhengxiao.zx@alibaba-inc.com,
 shuangtai.tst@alibaba-inc.com, qemu-devel@nongnu.org, eauger@redhat.com,
 yi.l.liu@intel.com, ziye.yang@intel.com, mlevitsk@redhat.com,
 pasic@linux.ibm.com, felipe@nutanix.com, changpeng.liu@intel.com,
 Ken.Xue@amd.com, jonathan.davies@nutanix.com, shaopeng.he@intel.com,
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org, libvir-list@redhat.com,
 eskultet@redhat.com, kevin.tian@intel.com, zhenyuw@linux.intel.com,
 zhi.a.wang@intel.com, cjia@nvidia.com, kwankhede@nvidia.com,
 berrange@redhat.com, dinechin@redhat.com
Subject: Re: [PATCH v2 1/2] vfio/mdev: add version attribute for mdev device
Message-ID: <20190510114838.7e16c3d6.cohuck@redhat.com>
In-Reply-To: <20190510093608.GD2854@work-vm>
References: <20190506014514.3555-1-yan.y.zhao@intel.com>
 <20190506014904.3621-1-yan.y.zhao@intel.com>
 <20190507151826.502be009@x1.home>
 <20190509173839.2b9b2b46.cohuck@redhat.com>
 <20190509154857.GF2868@work-vm>
 <20190509175404.512ae7aa.cohuck@redhat.com>
 <20190509164825.GG2868@work-vm>
 <20190510110838.2df4c4d0.cohuck@redhat.com>
 <20190510093608.GD2854@work-vm>
Organization: Red Hat GmbH
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 10 May 2019 10:36:09 +0100
"Dr. David Alan Gilbert" wrote:

> * Cornelia Huck (cohuck@redhat.com) wrote:
> > On Thu, 9 May 2019 17:48:26 +0100
> > "Dr. David Alan Gilbert" wrote:
> >
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > On Thu, 9 May 2019 16:48:57 +0100
> > > > "Dr. David Alan Gilbert" wrote:
> > > >
> > > > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > > > On Tue, 7 May 2019 15:18:26 -0600
> > > > > > Alex Williamson wrote:
> > > > > >
> > > > > > > On Sun, 5 May 2019 21:49:04 -0400
> > > > > > > Yan Zhao wrote:
> > > > > > >
> > > > > > > > + Errno:
> > > > > > > > + If vendor driver wants to claim a mdev device incompatible to all other mdev
> > > > > > > > + devices, it should not register version attribute for this mdev device. But if
> > > > > > > > + a vendor driver has already registered version attribute and it wants to claim
> > > > > > > > + a mdev device incompatible to all other mdev devices, it needs to return
> > > > > > > > + -ENODEV on access to this mdev device's version attribute.
> > > > > > > > + If a mdev device is only incompatible to certain mdev devices, write of
> > > > > > > > + incompatible mdev devices's version strings to its version attribute should
> > > > > > > > + return -EINVAL;
> > > > > > >
> > > > > > > I think it's best not to define the specific errno returned for a
> > > > > > > specific situation, let the vendor driver decide, userspace simply
> > > > > > > needs to know that an errno on read indicates the device does not
> > > > > > > support migration version comparison and that an errno on write
> > > > > > > indicates the devices are incompatible or the target doesn't support
> > > > > > > migration versions.
> > > > > >
> > > > > > I think I have to disagree here: It's probably valuable to have an
> > > > > > agreed error for 'cannot migrate at all' vs 'cannot migrate between
> > > > > > those two particular devices'. Userspace might want to do different
> > > > > > things (e.g. trying with different device pairs).
> > > > >
> > > > > Trying to stuff these things down an errno seems a bad idea; we can't
> > > > > get much information that way.
> > > >
> > > > So, what would be a reasonable approach? Userspace should first read
> > > > the version attributes on both devices (to find out whether migration
> > > > is supported at all), and only then figure out via writing whether they
> > > > are compatible?
> > > >
> > > > (Or just go ahead and try, if it does not care about the reason.)
> > >
> > > Well, I'm OK with something like writing to test whether it's
> > > compatible, it's just we need a better way of saying 'no'.
> > > I'm not sure if that involves reading back from somewhere after
> > > the write or what.
> >
> > Hm, so I basically see two ways of doing that:
> > - standardize on some error codes... problem: error codes can be hard
> >   to fit to reasons
> > - make the error available in some attribute that can be read
> >
> > I'm not sure how we can serialize the readback with the last write,
> > though (this looks inherently racy).
> >
> > How important is detailed error reporting here?
>
> I think we need something, otherwise we're just going to get vague
> user reports of 'but my VM doesn't migrate'; I'd like the error to be
> good enough to point most users to something they can understand
> (e.g. wrong card family/too old a driver etc).

Ok, that sounds like a reasonable point. Not that I have a better idea
how to achieve that, though... we could also log a more verbose error
message to the kernel log, but that's not necessarily where a user will
look first.

Ideally, we'd want to have the user space program setting up things
querying the general compatibility for migration (so that it becomes
their problem on how to alert the user to problems :), but I'm not sure
how to eliminate the race between asking the vendor driver for
compatibility and getting the result of that operation. Unless we
introduce an interface that can retrieve _all_ results together with
the written value? Or is that not going to be much of a problem in
practice?
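
For illustration only, the userspace flow discussed in this thread (read the version attribute on both devices, then write the source's version string to the target) could look roughly like the sketch below. It is not taken from the patch: the names `Compat`, `read_attr`, `write_attr` and `check_compat` are hypothetical, as are the attribute paths a caller would pass in, and the sketch assumes only what the thread states, namely that an errno on read means version comparison is unsupported and an errno on write means the target rejected the version.

```python
import errno
import os
from enum import Enum


class Compat(Enum):
    # Hypothetical result categories, named for this sketch only.
    COMPATIBLE = "compatible"
    NO_VERSION_SUPPORT = "version comparison not supported"
    INCOMPATIBLE = "target rejected the source version"


def read_attr(path):
    """Read a sysfs-style attribute and strip surrounding whitespace."""
    with open(path, "r") as f:
        return f.read().strip()


def write_attr(path, value):
    """Write a value to a sysfs-style attribute with a single write() call."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, value.encode())
    finally:
        os.close(fd)


def check_compat(src_attr, dst_attr, read=read_attr, write=write_attr):
    """Return (Compat, errno name or None) for a source/target attribute pair.

    The read and write callables are injectable so the flow can be
    exercised without real devices.
    """
    # Step 1: an errno on read means the device offers no version comparison.
    try:
        src_version = read(src_attr)
        read(dst_attr)
    except OSError as e:
        return Compat.NO_VERSION_SUPPORT, errno.errorcode.get(e.errno, str(e.errno))

    # Step 2: an errno on write means the target does not accept this version.
    try:
        write(dst_attr, src_version)
    except OSError as e:
        return Compat.INCOMPATIBLE, errno.errorcode.get(e.errno, str(e.errno))

    return Compat.COMPATIBLE, None
```

In this sketch the errno comes back from the same write() call that submitted the version string, so that part does not race. A more detailed reason fetched from a separate attribute afterwards would need the serialization asked about above.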