Subject: Re: [RFC]Add new mdev interface for QoS
From: "Gao, Ping A"
To: Alex Williamson
Cc: kwankhede@nvidia.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    "Tian, Kevin", Zhenyu Wang, Jike Song, libvir-list@redhat.com,
    zhi.a.wang@intel.com
Date: Tue, 1 Aug 2017 13:54:27 +0800
In-Reply-To: <9607b33d-7b3a-1bcf-1ad9-4b554100e68a@intel.com>

On 2017/7/28 0:00, Gao, Ping A wrote:
> On 2017/7/27 0:43, Alex Williamson wrote:
>> [cc +libvir-list]
>>
>> On Wed, 26 Jul 2017 21:16:59 +0800
>> "Gao, Ping A" wrote:
>>
>>> vfio-mdev provides the capability to let different guests share the
>>> same physical device through mediated sharing. As a result, it brings a
>>> requirement for controlling the device sharing: we need a QoS-related
>>> interface for mdev to manage virtual device resources.
>>>
>>> E.g. in practical use, vGPUs assigned to different guests almost always
>>> have different performance requirements: some guests may need higher
>>> priority for real-time usage, others may need a larger portion of the
>>> GPU resource to get higher 3D performance. Correspondingly, we can
>>> define interfaces like weight/cap for overall budget control and
>>> priority for single submission control.
>>>
>>> So I suggest adding some common, vendor-agnostic attributes to the
>>> mdev core sysfs for QoS purposes.
>> I think what you're asking for is just some standardization of a QoS
>> attribute_group which a vendor can optionally include within the
>> existing mdev_parent_ops.mdev_attr_groups. The mdev core will
>> transparently enable this, but it really only provides the standard,
>> all of the support code is left for the vendor. I'm fine with that,
>> but of course the trouble with any sort of standardization is arriving
>> at an agreed upon standard. Are there QoS knobs that are generic
>> across any mdev device type? Are there others that are more specific
>> to vGPU? Are there existing examples of this that we can steal their
>> specification?
> Yes, you are right, standardized QoS knobs are exactly what I wanted.
> Only when it becomes a part of the mdev framework and libvirt can such
> a critical feature as QoS be leveraged for cloud usage. HW vendors only
> need to focus on the implementation of the corresponding QoS algorithm
> in their back-end driver.
>
> The vfio-mdev framework provides the capability to share a device that
> lacks HW virtualization support with guests. No matter the device type,
> mediated sharing is actually a time-sharing multiplex method. From this
> point of view, QoS can be taken as a generic way to control the time
> assigned to a virtual mdev device that occupies the HW. As a result, we
> can define QoS knobs that are generic across any device type. Even if
> the HW has some kind of built-in QoS support, I think it's not a problem
> for the back-end driver to convert the mdev standard QoS definition to
> its own specification to reach the same performance expectation. There
> seem to be no examples for us to follow; we need to define it from
> scratch.
>
> I propose universal QoS control interfaces like below:
>
> Cap: The cap limits the maximum percentage of time an mdev device can
> own the physical device. E.g. cap=60 means the mdev device cannot take
> over 60% of the total physical resource.
>
> Weight: The weight defines proportional control of the mdev device
> resource between guests. It's orthogonal to Cap and targets load
> balancing. E.g. if guest 1 should take double the mdev device resource
> compared with guest 2, the weight ratio needs to be set to 2:1.
>
> Priority: The guest with higher priority will get execution first;
> this targets real-time usage and speeding up interactive response.
>
> The above QoS interfaces cover both overall budget control and single
> submission control. I will send out a detailed design later once we are
> aligned.

Hi Alex,
Any comments about the interface mentioned above?

>> Also, mdev devices are not necessarily the exclusive users of the
>> hardware, we can have a native user such as a local X client. They're
>> not an mdev user, so we can't support them via the mdev_attr_group.
>> Does there need to be a per mdev parent QoS attribute_group standard
>> for somehow defining the QoS of all the child mdev devices, or perhaps
>> representing the remaining host QoS attributes?
> That's really an open question. If we don't take host workload into
> consideration for cloud usage, it's not a problem any more; however,
> such an assumption is not reasonable. Anyway, if we take mdev devices as
> clients of the host driver, and the host driver provides the capability
> to divide out a portion of HW resource to mdev devices, then we only
> need to take care of the resource that the host assigned to mdev
> devices. This way, QoS for mdev focuses on the relationship between mdev
> devices, with no need to take care of the host workload.
>
> -Ping
>
>> Ultimately libvirt and upper level management tools would be the
>> consumer of these control knobs, so let's immediately get libvirt
>> involved in the discussion. Thanks,
>>
>> Alex
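
To make the Cap, Weight and Priority semantics discussed in this thread a bit more concrete, here is a small user-space simulation. It is only a sketch and is not taken from any mdev driver or from the proposal itself: the names MdevQos, allocate_shares and submission_order are hypothetical, the choice to hand the surplus of a capped device back to the other devices by weight is an assumption, and so is the convention that a larger priority value runs first. Weights are assumed to be positive.

```python
from dataclasses import dataclass


@dataclass
class MdevQos:
    """Hypothetical per-mdev QoS settings (illustrative names only)."""
    name: str
    weight: int        # relative share versus other mdev devices, must be > 0
    cap: float         # maximum percentage of device time, 0 < cap <= 100
    priority: int = 0  # assumed convention: larger value is submitted first


def allocate_shares(devices, budget=100.0):
    """Split `budget` percent of device time by weight while honouring caps.

    A device whose proportional share would exceed its cap is pinned at the
    cap; the time it cannot use is redistributed among the rest by weight.
    """
    shares = {}
    active = list(devices)
    remaining = budget
    while active:
        total_weight = sum(d.weight for d in active)
        capped = [d for d in active
                  if remaining * d.weight / total_weight > d.cap]
        if not capped:
            # Nobody hits a cap: plain proportional split of what is left.
            for d in active:
                shares[d.name] = remaining * d.weight / total_weight
            break
        for d in capped:
            shares[d.name] = d.cap
            remaining -= d.cap
        active = [d for d in active if d not in capped]
    return shares


def submission_order(devices):
    """Order device names by priority, highest first (ties keep input order)."""
    return [d.name for d in sorted(devices, key=lambda d: -d.priority)]


if __name__ == "__main__":
    guests = [
        MdevQos("guest1", weight=2, cap=60.0, priority=0),
        MdevQos("guest2", weight=1, cap=100.0, priority=5),
    ]
    print(allocate_shares(guests))
    print(submission_order(guests))
```

In this example the 2:1 weight ratio alone would give guest1 about two thirds of the device time, but cap=60 limits it to 60% and guest2 receives the rest. Priority is kept separate from the budget: it only decides whose pending work is submitted first, which matches the split between overall budget control and single submission control described above.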