2014-12-04 02:27:20

by Jike Song

Subject: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

Hi all,

We are pleased to announce the first release of the KVMGT project. KVMGT is the implementation of Intel GVT-g technology, a full GPU virtualization solution. Under Intel GVT-g, a virtual GPU instance is maintained for each VM, with part of the performance-critical resources directly assigned. The ability to run a native graphics driver inside a VM, without hypervisor intervention in performance-critical paths, achieves a good balance of performance, features, and sharing capability.


KVMGT is still in an early stage:

- Basic functions of full GPU virtualization work; the guest can see a
full-featured vGPU. We ran several 3D workloads such as lightsmark,
nexuiz, urbanterror and warsow.

- Only Linux guests are supported so far, and PPGTT must be disabled in
the guest through a kernel parameter (see README.kvmgt in QEMU).

- This drop also includes some Xen-specific changes, which will be
cleaned up later.

- Our end goal is to upstream both XenGT and KVMGT, which share ~90% of
the logic for the vGPU device model (which will be part of the i915
driver), with the only differences in hypervisor-specific services.

- Test coverage is insufficient, so please bear with stability issues :)



There are things that need to be improved, especially the KVM interfacing part:

1. A domid was added to each KVMGT guest

An ID is needed for foreground OS switching, e.g.

# echo <domid> > /sys/kernel/vgt/control/foreground_vm

domid 0 is reserved for the host OS.
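
As a rough illustration of the sysfs side of this interface, the store
handler could look something like the sketch below. This is only a sketch
under stated assumptions: vgt_switch_foreground() and the attribute wiring
are hypothetical names, not the posted code.

/* Hypothetical sketch: parse a domid written to
 * /sys/kernel/vgt/control/foreground_vm and trigger the display
 * owner switch; vgt_switch_foreground() is an assumed helper. */
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>

extern int vgt_switch_foreground(int domid);    /* assumed helper */

static ssize_t foreground_vm_store(struct kobject *kobj,
                                   struct kobj_attribute *attr,
                                   const char *buf, size_t count)
{
    int domid, ret;

    ret = kstrtoint(buf, 0, &domid);
    if (ret)
        return ret;

    ret = vgt_switch_foreground(domid);         /* domid 0 == host OS */
    return ret ? ret : count;
}

static struct kobj_attribute foreground_vm_attr =
    __ATTR(foreground_vm, 0200, NULL, foreground_vm_store);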


2. SRCU workarounds.

Some KVM functions, such as:

kvm_io_bus_register_dev
install_new_memslots

must be called *without* &kvm->srcu read-locked. Otherwise they hang.

In KVMGT, we need to register an iodev only *after* the BAR registers are
written by the guest. That means &kvm->srcu is already held - trapping and
emulating PIO (the BAR registers) puts us in exactly that condition, and
that makes kvm_io_bus_register_dev hang.

Currently we have to disable rcu_assign_pointer() in such functions.

These are dirty workarounds; your suggestions are highly welcome!
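
One hedged alternative to patching rcu_assign_pointer() might be to defer
the registration out of the SRCU read-side critical section entirely, e.g.
to a workqueue. The sketch below only illustrates the idea; the kvmgt_*
names are hypothetical, not part of the posted code.

/* Hypothetical sketch: defer kvm_io_bus_register_dev() to process
 * context so it never runs inside the vcpu's srcu read-side critical
 * section (i.e. the BAR-write emulation path). */
#include <linux/kvm_host.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct kvmgt_iodev_work {
    struct work_struct work;
    struct kvm *kvm;
    struct kvm_io_device *dev;      /* from KVM's iodev.h */
    gpa_t addr;
    int len;
};

static void kvmgt_register_iodev_fn(struct work_struct *work)
{
    struct kvmgt_iodev_work *w =
        container_of(work, struct kvmgt_iodev_work, work);

    /* We are outside srcu_read_lock(&kvm->srcu) here, so taking
     * slots_lock and letting KVM synchronize SRCU is safe. */
    mutex_lock(&w->kvm->slots_lock);
    kvm_io_bus_register_dev(w->kvm, KVM_MMIO_BUS, w->addr, w->len, w->dev);
    mutex_unlock(&w->kvm->slots_lock);
    kfree(w);
}

/* Called from the BAR-write trap instead of registering directly. */
static void kvmgt_register_iodev_deferred(struct kvm *kvm,
                                          struct kvm_io_device *dev,
                                          gpa_t addr, int len)
{
    struct kvmgt_iodev_work *w = kmalloc(sizeof(*w), GFP_ATOMIC);

    if (!w)
        return;

    INIT_WORK(&w->work, kvmgt_register_iodev_fn);
    w->kvm = kvm;
    w->dev = dev;
    w->addr = addr;
    w->len = len;
    schedule_work(&w->work);
}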


3. Syscalls were used to access "/dev/mem" from the kernel

An in-kernel memslot was added for the aperture, but syscall-like
operations (open and mmap) are used to open and access the character
device "/dev/mem" for pass-through.




The source code (kernel, QEMU, as well as SeaBIOS) is available on GitHub:

git://github.com/01org/KVMGT-kernel
git://github.com/01org/KVMGT-qemu
git://github.com/01org/KVMGT-seabios

In the KVMGT-qemu repository, there is a "README.kvmgt" to refer to.



More information about Intel GVT-g and KVMGT can be found at:

https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
http://events.linuxfoundation.org/sites/events/files/slides/KVMGT-a%20Full%20GPU%20Virtualization%20Solution_1.pdf


Appreciate your comments, BUG reports, and contributions!




--
Thanks,
Jike


2014-12-05 08:50:30

by Gerd Hoffmann

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

Hi,

> In KVMGT, we need to register an iodev only *after* BAR registers are
> written by guest.

Oh, the guest can write the BAR register at any time. Typically it
happens at boot only, but it can also happen at runtime, for example on
reboot.

I've also seen the kernel redo the PCI mappings created by the BIOS,
due to buggy _CRS declarations in the qemu ACPI tables.

> https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian

/me goes read this.

A few comments on the kernel stuff (brief look so far, also
compile-tested only, intel gfx on my test machine is too old).

* Noticed the kernel bits don't even compile when configured as
module. Everything (vgt, i915, kvm) must be compiled into the
kernel.
* Design approach still seems to be i915 on vgt not the other way
around.

Qemu/SeaBIOS bits:

I've seen the host bridge change identity from i440fx to
copy-pci-ids-from-host. I guess the reason for this is that seabios uses
this device to figure out whether it is running on i440fx or q35. Correct?

What are the exact requirements for the device? Must it match the host
exactly, to not confuse the guest intel graphics driver? Or would
something more recent -- such as the q35 emulation qemu has -- be good
enough to make things work (assuming we add support for the
graphic-related pci config space registers there)?

The patch also adds a dummy isa bridge at 0x1f. Similar question here:
What exactly is needed here? Would things work if we simply use the q35
lpc device here?

more to come after I've read the paper linked above ...

cheers,
Gerd

2014-12-05 13:53:52

by Daniel Vetter

Subject: Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On Fri, Dec 05, 2014 at 09:50:21AM +0100, Gerd Hoffmann wrote:
> > https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
>
> /me goes read this.
>
> A few comments on the kernel stuff (brief look so far, also
> compile-tested only, intel gfx on my test machine is too old).
>
> * Noticed the kernel bits don't even compile when configured as
> module. Everything (vgt, i915, kvm) must be compiled into the
> kernel.
> * Design approach still seems to be i915 on vgt not the other way
> around.

Yeah, I've done a quick read-through of just the i915 bits too, same comment. I
guess this is just the first RFC, and the redesign we've already discussed
with XenGT is in progress somewhere?

Thanks, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2014-12-05 13:03:41

by Paolo Bonzini

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM



On 05/12/2014 09:50, Gerd Hoffmann wrote:
> A few comments on the kernel stuff (brief look so far, also
> compile-tested only, intel gfx on my test machine is too old).
>
> * Noticed the kernel bits don't even compile when configured as
> module. Everything (vgt, i915, kvm) must be compiled into the
> kernel.

I'll add that the patch is basically impossible to review with all the
XenGT bits still in. For example, the x86 emulator seems to be
unnecessary for KVMGT, but I am not 100% sure.

I would like a clear understanding of why/how Andrew Barnes was able to
do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this
does not apply to GVT-g.

Paolo

> * Design approach still seems to be i915 on vgt not the other way
> around.
>
> Qemu/SeaBIOS bits:
>
> I've seen the host bridge changes identity from i440fx to
> copy-pci-ids-from-host. Guess the reason for this is that seabios uses
> this device to figure whenever it is running on i440fx or q35. Correct?
>
> What are the exact requirements for the device? Must it match the host
> exactly, to not confuse the guest intel graphics driver? Or would
> something more recent -- such as the q35 emulation qemu has -- be good
> enough to make things work (assuming we add support for the
> graphic-related pci config space registers there)?
>
> The patch also adds a dummy isa bridge at 0x1f. Simliar question here:
> What exactly is needed here? Would things work if we simply use the q35
> lpc device here?
>
> more to come after I've read the paper linked above ...
>
> cheers,
> Gerd

2014-12-06 04:20:30

by Jike Song

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On 12/05/2014 04:50 PM, Gerd Hoffmann wrote:
> A few comments on the kernel stuff (brief look so far, also
> compile-tested only, intel gfx on my test machine is too old).
>
> * Noticed the kernel bits don't even compile when configured as
> module. Everything (vgt, i915, kvm) must be compiled into the
> kernel.

Yes, that's planned to be done along with separating hypervisor-related
code from vgt.

> * Design approach still seems to be i915 on vgt not the other way
> around.

So far yes.

>
> Qemu/SeaBIOS bits:
>
> I've seen the host bridge change identity from i440fx to
> copy-pci-ids-from-host. I guess the reason for this is that seabios uses
> this device to figure out whether it is running on i440fx or q35. Correct?
>

I did a trick in SeaBIOS/QEMU. The purpose is to make QEMU:

- provide the IDs of an old host bridge to SeaBIOS
- provide the IDs of the new host bridge (the physical ones) to the guest OS

So I made SeaBIOS tell QEMU that POST is done, before jumping to the guest
OS context.

This may be the simplest method to make things work, but yes, the q35
emulation in QEMU may make this unnecessary; see below.
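
A minimal sketch of that ID-switching idea, assuming a post_done flag that
SeaBIOS raises through some side channel (the flag, host_bridge_ids, and
the function name are hypothetical, not the actual KVMGT-qemu code):

/* Hypothetical sketch: serve the emulated i440FX IDs to firmware and
 * the physical host bridge IDs to the guest OS, keyed off a
 * "POST done" notification from SeaBIOS. */
#include "hw/pci/pci.h"

static bool post_done;           /* set when SeaBIOS signals end of POST */
static uint32_t host_bridge_ids; /* host vendor ID (low 16 bits) / device ID (high 16) */

static uint32_t igd_host_bridge_read_config(PCIDevice *d,
                                            uint32_t addr, int len)
{
    /* Only the vendor/device ID dword needs special casing; everything
     * else falls through to the stock i440FX emulation. */
    if (post_done && addr == PCI_VENDOR_ID && len == 4)
        return host_bridge_ids;

    return pci_default_read_config(d, addr, len);
}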

> What are the exact requirements for the device? Must it match the host
> exactly, to not confuse the guest intel graphics driver? Or would
> something more recent -- such as the q35 emulation qemu has -- be good
> enough to make things work (assuming we add support for the
> graphic-related pci config space registers there)?
>

I don't know what exactly is needed; we also need to take the Windows
driver into consideration. However, I'm quite confident that if things
work for IGD passthrough, they will work for GVT-g.

> The patch also adds a dummy isa bridge at 0x1f. Similar question here:
> What exactly is needed here? Would things work if we simply use the q35
> lpc device here?
>

Ditto.

> more to come after I've read the paper linked above ...

Thanks for review :)

>
> cheers,
> Gerd
>

--
Thanks,
Jike

2014-12-06 04:33:32

by Jike Song

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

CC Andy :)

On 12/05/2014 09:03 PM, Paolo Bonzini wrote:
>
> On 05/12/2014 09:50, Gerd Hoffmann wrote:
>> A few comments on the kernel stuff (brief look so far, also
>> compile-tested only, intel gfx on my test machine is too old).
>>
>> * Noticed the kernel bits don't even compile when configured as
>> module. Everything (vgt, i915, kvm) must be compiled into the
>> kernel.
>
> I'll add that the patch is basically impossible to review with all the
> XenGT bits still in. For example, the x86 emulator seems to be
> unnecessary for KVMGT, but I am not 100% sure.
>

This is not ready for merge yet; please wait for a while, we'll have the
Xen/KVM-specific code separated.

BTW, you are definitely right: the emulator is unnecessary for KVMGT,
and ... unnecessary for XenGT :)

> I would like a clear understanding of why/how Andrew Barnes was able to
> do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this
> does not apply to GVT-g.

AFAIK, the graphics drivers need to figure out the offsets of some MMIO
registers from the IDs of this ISA bridge (see the sketch below). They
simply won't work without this information.

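As a hedged illustration of this dependency (modelled loosely on i915's
intel_detect_pch(); the device ID constants below match the i915 ones,
but the reduced function itself is hypothetical):

/* Hypothetical sketch, modelled loosely on i915's intel_detect_pch():
 * the driver reads the ISA bridge's device ID to identify the PCH
 * generation, and picks register offsets/behaviour accordingly. */
#include <linux/pci.h>

#define INTEL_PCH_CPT_DEVICE_ID_TYPE 0x1c00   /* CougarPoint (as in i915) */
#define INTEL_PCH_LPT_DEVICE_ID_TYPE 0x8c00   /* LynxPoint   (as in i915) */

static int detect_pch_type(void)
{
    struct pci_dev *isa = NULL;

    while ((isa = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, isa)) != NULL) {
        if (isa->vendor != PCI_VENDOR_ID_INTEL)
            continue;

        switch (isa->device & 0xff00) {
        case INTEL_PCH_CPT_DEVICE_ID_TYPE:
            return 1;   /* one set of MMIO offsets/behaviour */
        case INTEL_PCH_LPT_DEVICE_ID_TYPE:
            return 2;   /* another set */
        }
    }

    return -ENODEV;     /* no recognized ISA bridge: detection fails */
}
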
I talked with Andy about the pass-through, but I don't have his implementation;
CC'ing Andy for his advice :)

>
> Paolo
>

Thanks for the review. Would you please also have a look at the issues I mentioned
in the original email? They are mostly KVM-related: the SRCU trickiness, the domid,
and the memslot created in the kernel.

Thank you!

--
Thanks,
Jike

2014-12-06 04:35:35

by Jike Song

Subject: Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On 12/05/2014 09:54 PM, Daniel Vetter wrote:
> Yeah, I've done a quick read-through of just the i915 bits too, same comment. I
> guess this is just the first RFC, and the redesign we've already discussed
> with XenGT is in progress somewhere?

Yes, it's marching on with Xen now. The KVM implementation is
currently not even feature-complete - we still have PPGTT missing.


>
> Thanks, Daniel
>

--
Thanks,
Jike

2014-12-08 09:55:11

by Gerd Hoffmann

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
> On 12/05/2014 04:50 PM, Gerd Hoffmann wrote:
> > A few comments on the kernel stuff (brief look so far, also
> > compile-tested only, intel gfx on my test machine is too old).
> >
> > * Noticed the kernel bits don't even compile when configured as
> > module. Everything (vgt, i915, kvm) must be compiled into the
> > kernel.
>
> Yes, that's planned to be done along with separating hypervisor-related
> code from vgt.

Good.

> > What are the exact requirements for the device? Must it match the host
> > exactly, to not confuse the guest intel graphics driver? Or would
> > something more recent -- such as the q35 emulation qemu has -- be good
> > enough to make things work (assuming we add support for the
> > graphic-related pci config space registers there)?
> >
>
> I don't know what exactly is needed; we also need to take the Windows
> driver into consideration. However, I'm quite confident that if things
> work for IGD passthrough, they will work for GVT-g.

I'd suggest focusing on q35 emulation. q35 is new enough that a version
with integrated graphics exists, so the gap we have to close is *much*
smaller.

In case guests expect a northbridge matching the chipset generation of
the graphics device (which I'd expect is the case, after digging a bit
in the igd and agpgart linux driver code), I think we should add proper
device emulation for them, i.e. complement q35-pcihost with
sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost variants instead
of just copying over the pci ids from the host. Most likely all those
variants can share most of the emulation code.

SeaBIOS then can just get support for these three northbridge variants,
so we don't need magic pci id switching hacks at all.
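
A rough sketch of what such a variant might look like as a thin subclass
of the existing q35 MCH device, assuming the PCI identity is the main
difference (the type name and the SNB device ID 0x0100 are assumptions,
not existing QEMU code):

/* Hypothetical sketch: a Sandy Bridge flavoured northbridge as a thin
 * subclass of q35's MCH device, differing mainly in PCI identity. */
#include "hw/pci-host/q35.h"

static void sandybridge_pcihost_class_init(ObjectClass *klass, void *data)
{
    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);

    k->vendor_id = PCI_VENDOR_ID_INTEL;
    k->device_id = 0x0100;   /* SNB desktop host bridge ID (assumed) */
}

static const TypeInfo sandybridge_pcihost_info = {
    .name       = "sandybridge-pcihost",
    .parent     = TYPE_MCH_PCI_DEVICE,   /* reuse the q35 MCH emulation core */
    .class_init = sandybridge_pcihost_class_init,
};

static void sandybridge_pcihost_register_types(void)
{
    type_register_static(&sandybridge_pcihost_info);
}

type_init(sandybridge_pcihost_register_types)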

> > The patch also adds a dummy isa bridge at 0x1f. Similar question here:
> > What exactly is needed here? Would things work if we simply use the q35
> > lpc device here?

> Ditto.

OK. Let's try to just use the q35 emulation + q35 lpc device then,
instead of adding a second dummy lpc device.

cheers,
Gerd

2014-12-08 10:20:20

by Daniel Vetter

Subject: Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On Mon, Dec 08, 2014 at 10:55:01AM +0100, Gerd Hoffmann wrote:
> On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
> > I don't know what exactly is needed; we also need to take the Windows
> > driver into consideration. However, I'm quite confident that if things
> > work for IGD passthrough, they will work for GVT-g.
>
> I'd suggest to focus on q35 emulation. q35 is new enough that a version
> with integrated graphics exists, so the gap we have to close is *much*
> smaller.
>
> In case guests expect a northbridge matching the chipset generation of
> the graphics device (which I'd expect is the case, after digging a bit
> in the igd and agpgart linux driver code) I think we should add proper
> > device emulation for them, i.e. complement q35-pcihost with
> sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost instead of
> just copying over the pci ids from the host. Most likely all those
> variants can share most of the emulation code.

I don't think i915.ko should care about either the northbridge or the PCH on
para-virtualized platforms. We do noodle around in there for the oddball
memory controller setting and for some display stuff. But neither of those
really applies to paravirtualized hw. And if there's any case like that, we
should patch it out (like we do with some of the runtime pm code
already).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2014-12-09 02:49:53

by Tian, Kevin

Subject: RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

Here is some background on this KVMGT release:

- The major purpose is early experimentation with this technique in KVM, and
to raise issues around adding an in-kernel device model (or mediated pass-through
framework) to KVM.

- KVMGT shares ~90% of its code with XenGT, with regard to the vGPU device model. The
only difference is the in-kernel dm interface. The vGPU device model will be
split out and integrated in the i915 driver. It will register to an in-kernel dm framework
provided either by Xen or KVM at boot time. Upstreaming of the vGPU device
model is already in progress, with valuable comments received from the i915
community. However, the refactoring mostly happens in the XenGT repo now.

- Now we have XenGT/KVMGT separately maintained, and KVMGT lags
behind XenGT in features and quality. Likely you'll continue to
see stale code (like the Xen instruction decoder) for some time. In the future we
plan to maintain a single kernel repo for both, so KVMGT can reach the
same quality as XenGT once the KVM in-kernel dm framework is stable.

- Regarding the Qemu hacks, KVMGT really doesn't have any requirements
different from what has been discussed for GPU pass-through, e.g.
about the ISA bridge. Our implementation is based on an old Qemu repo,
and, honestly speaking, is not cleanly developed, because we know we
can leverage the GPU pass-through support once it's in Qemu. At
that time we'll use the same logic with minimal changes to
hook the KVMGT mgmt. APIs (e.g. create/destroy a vGPU instance). So
we can ignore this area for now. :-)

Thanks
Kevin

> From: Paolo Bonzini
> Sent: Friday, December 05, 2014 9:04 PM
>
>
>
> On 05/12/2014 09:50, Gerd Hoffmann wrote:
> > A few comments on the kernel stuff (brief look so far, also
> > compile-tested only, intel gfx on my test machine is too old).
> >
> > * Noticed the kernel bits don't even compile when configured as
> > module. Everything (vgt, i915, kvm) must be compiled into the
> > kernel.
>
> I'll add that the patch is basically impossible to review with all the
> XenGT bits still in. For example, the x86 emulator seems to be
> unnecessary for KVMGT, but I am not 100% sure.
>
> I would like a clear understanding of why/how Andrew Barnes was able to
> do i915 passthrough (GVT-d) without hacking the ISA bridge, and why this
> does not apply to GVT-g.
>
> Paolo
>
> > * Design approach still seems to be i915 on vgt not the other way
> > around.
> >
> > Qemu/SeaBIOS bits:
> >
> > I've seen the host bridge changes identity from i440fx to
> > copy-pci-ids-from-host. Guess the reason for this is that seabios uses
> > this device to figure whenever it is running on i440fx or q35. Correct?
> >
> > What are the exact requirements for the device? Must it match the host
> > exactly, to not confuse the guest intel graphics driver? Or would
> > something more recent -- such as the q35 emulation qemu has -- be good
> > enough to make things work (assuming we add support for the
> > graphic-related pci config space registers there)?
> >
> > The patch also adds a dummy isa bridge at 0x1f. Simliar question here:
> > What exactly is needed here? Would things work if we simply use the q35
> > lpc device here?
> >
> > more to come after I've read the paper linked above ...
> >
> > cheers,
> > Gerd

2014-12-09 02:52:03

by Tian, Kevin

Subject: RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

> From: Daniel Vetter
> Sent: Monday, December 08, 2014 6:21 PM
>
> On Mon, Dec 08, 2014 at 10:55:01AM +0100, Gerd Hoffmann wrote:
> > On Sa, 2014-12-06 at 12:17 +0800, Jike Song wrote:
> > > I don't know that is exactly needed, we also need to have Windows
> > > driver considered. However, I'm quite confident that, if things gonna
> > > work for IGD passthrough, it gonna work for GVT-g.
> >
> > I'd suggest to focus on q35 emulation. q35 is new enough that a version
> > with integrated graphics exists, so the gap we have to close is *much*
> > smaller.
> >
> > In case guests expect a northbridge matching the chipset generation of
> > the graphics device (which I'd expect is the case, after digging a bit
> > in the igd and agpgart linux driver code) I think we should add proper
> > device emulation for them, i.e. comply q35-pcihost with
> > sandybridge-pcihost + ivybridge-pcihost + haswell-pcihost instead of
> > just copying over the pci ids from the host. Most likely all those
> > variants can share most of the emulation code.
>
> I don't think i915.ko should care about either northbridge nor pch on
> para-virtualized platforms. We do noodle around in there for the oddball
> memory controller setting and for some display stuff. But neither of that
> really applies to paravirtualized hw. And if there's any case like that we
> should patch it out (like we do with some of the runtime pm code
> already).

Agree. Now Allen is working on how to avoid that tricky platform
stickiness in the Windows gfx driver. We should do the same thing on the
Linux side too.

Thanks
Kevin

2014-12-09 09:54:31

by Jan Kiszka

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

On 2014-12-04 03:24, Jike Song wrote:
> Hi all,
>
> We are pleased to announce the first release of KVMGT project. KVMGT is
> the implementation of Intel GVT-g technology, a full GPU virtualization
> solution. Under Intel GVT-g, a virtual GPU instance is maintained for
> each VM, with part of performance critical resources directly assigned.
> The capability of running native graphics driver inside a VM, without
> hypervisor intervention in performance critical paths, achieves a good
> balance of performance, feature, and sharing capability.
>
>
> KVMGT is still in the early stage:
>
> - Basic functions of full GPU virtualization works, guest can see a
> full-featured vGPU.
> We ran several 3D workloads such as lightsmark, nexuiz, urbanterror
> and warsow.
>
> - Only Linux guest supported so far, and PPGTT must be disabled in
> guest through a
> kernel parameter(see README.kvmgt in QEMU).
>
> - This drop also includes some Xen specific changes, which will be
> cleaned up later.
>
> - Our end goal is to upstream both XenGT and KVMGT, which shares ~90%
> logic for vGPU
> device model (will be part of i915 driver), with only difference in
> hypervisor
> specific services
>
> - insufficient test coverage, so please bear with stability issues :)
>
>
>
> There are things need to be improved, esp. the KVM interfacing part:
>
> 1 a domid was added to each KVMGT guest
>
> An ID is needed for foreground OS switching, e.g.
>
> # echo <domid> > /sys/kernel/vgt/control/foreground_vm
>
> domid 0 is reserved for host OS.
>
>
> 2 SRCU workarounds.
>
> Some KVM functions, such as:
>
> kvm_io_bus_register_dev
> install_new_memslots
>
> must be called *without* &kvm->srcu read-locked. Otherwise it
> hangs.
>
> In KVMGT, we need to register an iodev only *after* BAR
> registers are
> written by guest. That means, we already have &kvm->srcu hold -
> trapping/emulating PIO(BAR registers) makes us in such a condition.
> That will make kvm_io_bus_register_dev hangs.
>
> Currently we have to disable rcu_assign_pointer() in such
> functions.
>
> These were dirty workarounds, your suggestions are high welcome!
>
>
> 3 syscalls were called to access "/dev/mem" from kernel
>
> An in-kernel memslot was added for aperture, but using syscalls
> like
> open and mmap to open and access the character device "/dev/mem",
> for pass-through.
>
>
>
>
> The source codes(kernel, qemu as well as seabios) are available at github:
>
> git://github.com/01org/KVMGT-kernel
> git://github.com/01org/KVMGT-qemu
> git://github.com/01org/KVMGT-seabios
>
> In the KVMGT-qemu repository, there is a "README.kvmgt" to be referred.
>
>
>
> More information about Intel GVT-g and KVMGT can be found at:
>
> https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
>
> http://events.linuxfoundation.org/sites/events/files/slides/KVMGT-a%20Full%20GPU%20Virtualization%20Solution_1.pdf
>
>
>
> Appreciate your comments, BUG reports, and contributions!
>

There is an ever-increasing interest in keeping KVM's in-kernel guest
interface as small as possible, specifically for security reasons. I'm
sure there are some good performance reasons to create a new in-kernel
device model, but I suppose those will need good evidence of why things
are done the way they finally should be - and not via a user-space
device model. This is likely not a binary decision (all userspace vs. no
userspace); it is more about the size and robustness of the in-kernel
model vs. its performance.

One aspect could also be important: Are there hardware improvements in
sight that will eventually help to reduce the in-kernel device model and
make the overall design even more robust? How will those changes fit
best into a proposed user/kernel split?

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

2014-12-10 06:34:07

by Jike Song

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

CC Kevin.


On 12/09/2014 05:54 PM, Jan Kiszka wrote:
> On 2014-12-04 03:24, Jike Song wrote:
>> Hi all,
>>
>> We are pleased to announce the first release of KVMGT project. KVMGT is
>> the implementation of Intel GVT-g technology, a full GPU virtualization
>> solution. Under Intel GVT-g, a virtual GPU instance is maintained for
>> each VM, with part of performance critical resources directly assigned.
>> The capability of running native graphics driver inside a VM, without
>> hypervisor intervention in performance critical paths, achieves a good
>> balance of performance, feature, and sharing capability.
>>
>>
>> KVMGT is still in the early stage:
>>
>> - Basic functions of full GPU virtualization works, guest can see a
>> full-featured vGPU.
>> We ran several 3D workloads such as lightsmark, nexuiz, urbanterror
>> and warsow.
>>
>> - Only Linux guest supported so far, and PPGTT must be disabled in
>> guest through a
>> kernel parameter(see README.kvmgt in QEMU).
>>
>> - This drop also includes some Xen specific changes, which will be
>> cleaned up later.
>>
>> - Our end goal is to upstream both XenGT and KVMGT, which shares ~90%
>> logic for vGPU
>> device model (will be part of i915 driver), with only difference in
>> hypervisor
>> specific services
>>
>> - insufficient test coverage, so please bear with stability issues :)
>>
>>
>>
>> There are things need to be improved, esp. the KVM interfacing part:
>>
>> 1 a domid was added to each KVMGT guest
>>
>> An ID is needed for foreground OS switching, e.g.
>>
>> # echo <domid> > /sys/kernel/vgt/control/foreground_vm
>>
>> domid 0 is reserved for host OS.
>>
>>
>> 2 SRCU workarounds.
>>
>> Some KVM functions, such as:
>>
>> kvm_io_bus_register_dev
>> install_new_memslots
>>
>> must be called *without* &kvm->srcu read-locked. Otherwise it
>> hangs.
>>
>> In KVMGT, we need to register an iodev only *after* BAR
>> registers are
>> written by guest. That means, we already have &kvm->srcu hold -
>> trapping/emulating PIO(BAR registers) makes us in such a condition.
>> That will make kvm_io_bus_register_dev hangs.
>>
>> Currently we have to disable rcu_assign_pointer() in such
>> functions.
>>
>> These were dirty workarounds, your suggestions are high welcome!
>>
>>
>> 3 syscalls were called to access "/dev/mem" from kernel
>>
>> An in-kernel memslot was added for aperture, but using syscalls
>> like
>> open and mmap to open and access the character device "/dev/mem",
>> for pass-through.
>>
>>
>>
>>
>> The source codes(kernel, qemu as well as seabios) are available at github:
>>
>> git://github.com/01org/KVMGT-kernel
>> git://github.com/01org/KVMGT-qemu
>> git://github.com/01org/KVMGT-seabios
>>
>> In the KVMGT-qemu repository, there is a "README.kvmgt" to be referred.
>>
>>
>>
>> More information about Intel GVT-g and KVMGT can be found at:
>>
>> https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
>>
>> http://events.linuxfoundation.org/sites/events/files/slides/KVMGT-a%20Full%20GPU%20Virtualization%20Solution_1.pdf
>>
>>
>>
>> Appreciate your comments, BUG reports, and contributions!
>>
>
> There is an even increasing interest to keep KVM's in-kernel guest
> interface as small as possible, specifically for security reasons. I'm
> sure there are some good performance reasons to create a new in-kernel
> device model, but I suppose those will need good evidences why things
> are done in the way they finally should be - and not via a user-space
> device model. This is likely not a binary decision (all userspace vs. no
> userspace), it is more about the size and robustness of the in-kernel
> model vs. its performance.
>
> One aspect could also be important: Are there hardware improvements in
> sight that will eventually help to reduce the in-kernel device model and
> make the overall design even more robust? How will those changes fit
> best into a proposed user/kernel split?
>
> Jan
>

--
Thanks,
Jike


2014-12-10 07:28:34

by Tian, Kevin

Subject: RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

> From: Song, Jike
> Sent: Wednesday, December 10, 2014 2:34 PM
>
> CC Kevin.
>
>
> On 12/09/2014 05:54 PM, Jan Kiszka wrote:
> > On 2014-12-04 03:24, Jike Song wrote:
> >> Hi all,
> >>
> >> We are pleased to announce the first release of KVMGT project. KVMGT is
> >> the implementation of Intel GVT-g technology, a full GPU virtualization
> >> solution. Under Intel GVT-g, a virtual GPU instance is maintained for
> >> each VM, with part of performance critical resources directly assigned.
> >> The capability of running native graphics driver inside a VM, without
> >> hypervisor intervention in performance critical paths, achieves a good
> >> balance of performance, feature, and sharing capability.
> >>
> >>
> >> KVMGT is still in the early stage:
> >>
> >> - Basic functions of full GPU virtualization works, guest can see a
> >> full-featured vGPU.
> >> We ran several 3D workloads such as lightsmark, nexuiz, urbanterror
> >> and warsow.
> >>
> >> - Only Linux guest supported so far, and PPGTT must be disabled in
> >> guest through a
> >> kernel parameter(see README.kvmgt in QEMU).
> >>
> >> - This drop also includes some Xen specific changes, which will be
> >> cleaned up later.
> >>
> >> - Our end goal is to upstream both XenGT and KVMGT, which shares ~90%
> >> logic for vGPU
> >> device model (will be part of i915 driver), with only difference in
> >> hypervisor
> >> specific services
> >>
> >> - insufficient test coverage, so please bear with stability issues :)
> >>
> >>
> >>
> >> There are things need to be improved, esp. the KVM interfacing part:
> >>
> >> 1 a domid was added to each KVMGT guest
> >>
> >> An ID is needed for foreground OS switching, e.g.
> >>
> >> # echo <domid> > /sys/kernel/vgt/control/foreground_vm
> >>
> >> domid 0 is reserved for host OS.
> >>
> >>
> >> 2 SRCU workarounds.
> >>
> >> Some KVM functions, such as:
> >>
> >> kvm_io_bus_register_dev
> >> install_new_memslots
> >>
> >> must be called *without* &kvm->srcu read-locked. Otherwise it
> >> hangs.
> >>
> >> In KVMGT, we need to register an iodev only *after* BAR
> >> registers are
> >> written by guest. That means, we already have &kvm->srcu hold -
> >> trapping/emulating PIO(BAR registers) makes us in such a condition.
> >> That will make kvm_io_bus_register_dev hangs.
> >>
> >> Currently we have to disable rcu_assign_pointer() in such
> >> functions.
> >>
> >> These were dirty workarounds, your suggestions are high welcome!
> >>
> >>
> >> 3 syscalls were called to access "/dev/mem" from kernel
> >>
> >> An in-kernel memslot was added for aperture, but using syscalls like
> >> open and mmap to open and access the character device "/dev/mem",
> >> for pass-through.
> >>
> >>
> >>
> >>
> >> The source codes(kernel, qemu as well as seabios) are available at github:
> >>
> >> git://github.com/01org/KVMGT-kernel
> >> git://github.com/01org/KVMGT-qemu
> >> git://github.com/01org/KVMGT-seabios
> >>
> >> In the KVMGT-qemu repository, there is a "README.kvmgt" to be referred.
> >>
> >>
> >>
> >> More information about Intel GVT-g and KVMGT can be found at:
> >>
> >> https://www.usenix.org/conference/atc14/technical-sessions/presentation/tian
> >>
> >> http://events.linuxfoundation.org/sites/events/files/slides/KVMGT-a%20Full%20GPU%20Virtualization%20Solution_1.pdf
> >>
> >>
> >>
> >> Appreciate your comments, BUG reports, and contributions!
> >>
> >
> > There is an even increasing interest to keep KVM's in-kernel guest
> > interface as small as possible, specifically for security reasons. I'm
> > sure there are some good performance reasons to create a new in-kernel
> > device model, but I suppose those will need good evidences why things
> > are done in the way they finally should be - and not via a user-space
> > device model. This is likely not a binary decision (all userspace vs. no
> > userspace), it is more about the size and robustness of the in-kernel
> > model vs. its performance.

Thanks for explaining the background. We're not against the userspace
model where it applies, but based on our analysis we figured out that the
in-kernel model is the best fit, not just for performance reasons, but also
for the tight coupling to i915 functionality (scheduling, interrupts,
security, etc.) and hypervisor functionality (GPU shadow page tables, etc.),
which are best handled in the kernel directly. Definitely we don't want to
split it just for performance reasons, without a functionally clear
separation, because that just creates unnecessary/messy user/kernel
interfaces. And now we've got the i915 community's signal that they're
willing to pick the core code into the i915 driver, which we're currently
working on.

So, not to eliminate the possibility of a user/kernel split, how about we first
look at those in-kernel dm changes for KVM? Then you can help judge
whether they are reasonable changes or whether there's a better option. Jike
will summarize and start the discussion in a separate thread.

> >
> > One aspect could also be important: Are there hardware improvements in
> > sight that will eventually help to reduce the in-kernel device model and
> > make the overall design even more robust? How will those changes fit
> > best into a proposed user/kernel split?
> >

I can't talk about hardware improvements publicly, but the foreseen
changes target support in kernel drivers. :-)

Thanks
Kevin

2014-12-10 16:59:18

by Paolo Bonzini

Subject: Re: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM



On 09/12/2014 03:49, Tian, Kevin wrote:
> - Now we have XenGT/KVMGT separately maintained, and KVMGT lags
> behind XenGT in features and quality. Likely you'll continue to
> see stale code (like the Xen instruction decoder) for some time. In the future we
> plan to maintain a single kernel repo for both, so KVMGT can reach the
> same quality as XenGT once the KVM in-kernel dm framework is stable.
>
> - Regarding the Qemu hacks, KVMGT really doesn't have any requirements
> different from what has been discussed for GPU pass-through, e.g.
> about the ISA bridge. Our implementation is based on an old Qemu repo,
> and, honestly speaking, is not cleanly developed, because we know we
> can leverage the GPU pass-through support once it's in Qemu. At
> that time we'll use the same logic with minimal changes to
> hook the KVMGT mgmt. APIs (e.g. create/destroy a vGPU instance). So
> we can ignore this area for now. :-)

Could the virtual device model introduce new registers in order to avoid
poking at the ISA bridge? I'm not sure that you "can leverage the GPU
pass-through support once it's in Qemu", since the Xen IGD passthrough
support is being added to a separate machine that is specific to Xen IGD
passthrough; no ISA bridge hacking will probably be allowed on the "-M
pc" and "-M q35" machine types.

Paolo

2014-12-11 00:34:11

by Tian, Kevin

Subject: RE: [Intel-gfx] [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM

> From: Paolo Bonzini [mailto:[email protected]]
> Sent: Thursday, December 11, 2014 12:59 AM
>
> On 09/12/2014 03:49, Tian, Kevin wrote:
> > - Now we have XenGT/KVMGT separately maintained, and KVMGT lags
> > behind XenGT regarding to features and qualities. Likely you'll continue
> > see stale code (like Xen inst decoder) for some time. In the future we
> > plan to maintain a single kernel repo for both, so KVMGT can share
> > same quality as XenGT once KVM in-kernel dm framework is stable.
> >
> > - Regarding to Qemu hacks, KVMGT really doesn't have any different
> > requirements as what have been discussed for GPU pass-through, e.g.
> > about ISA bridge. Our implementation is based on an old Qemu repo,
> > and honestly speaking not cleanly developed, because we know we
> > can leverage from GPU pass-through support once it's in Qemu. At
> > that time we'll leverage the same logic with minimal changes to
> > hook KVMGT mgmt. APIs (e.g. create/destroy a vGPU instance). So
> > we can ignore this area for now. :-)
>
> Could the virtual device model introduce new registers in order to avoid
> poking at the ISA bridge? I'm not sure that you "can leverage the GPU
> pass-through support once it's in Qemu", since the Xen IGD passthrough
> support is being added to a separate machine that is specific to Xen IGD
> passthrough; no ISA bridge hacking will probably be allowed on the "-M
> pc" and "-M q35" machine types.
>

My point is that KVMGT doesn't introduce new requirements beyond what's
required in the IGD passthrough case, because all the hacks you see now
are there to satisfy the guest graphics driver's expectations. I haven't
followed up on the KVM IGD passthrough progress, but if it doesn't require
ISA bridge hacking, the same trick can be adopted by KVMGT too. You may
know Allen is working on driver changes to avoid causing those hacks on
the Qemu side. That effort will benefit us too. So I don't think this is
a KVMGT-specific issue; we need a common solution to close this gap
instead of hacking the vGPU device model alone.

Thanks
Kevin

2014-12-11 01:38:26

by Paolo Bonzini

Subject: Re: [ANNOUNCE][RFC] KVMGT - the implementation of Intel GVT-g(full GPU virtualization) for KVM



On 11/12/2014 01:33, Tian, Kevin wrote:
> My point is that KVMGT doesn't introduce new requirements beyond what's
> required in the IGD passthrough case, because all the hacks you see now
> are there to satisfy the guest graphics driver's expectations. I haven't
> followed up on the KVM IGD passthrough progress, but if it doesn't require
> ISA bridge hacking, the same trick can be adopted by KVMGT too.

Right now it did require ISA bridge hacking.

> You may know Allen is
> working on driver changes to avoid causing those hacks on the Qemu side.
> That effort will benefit us too.

That's good to know, thanks!

Paolo