Ok, I think I've addressed all comments so far here.
Rusty, I'd like this to go into linux-next, through your tree, and
hopefully 2.6.33. What do you think?
---
This implements vhost: a kernel-level backend for virtio.
The main motivation for this work is to reduce virtualization
overhead for virtio by removing system calls on the data path,
without guest changes. For virtio-net, this removes up to
4 system calls per packet: VM exit for the kick, re-entry after the kick,
iothread wakeup for the packet, and interrupt injection for the packet.
This driver is pretty minimal, but it's fully functional (including
migration support interfaces), and already shows a performance
(especially latency) improvement over userspace.
A more detailed description is attached to the patch itself.
The patches apply to both 2.6.32-rc6 and kvm.git. I'd like them to go
into linux-next if possible. Please comment.
Changelog from v7:
- Add note on RCU usage, mirroring this in vhost/vhost.h
- Fix locking typo noted by Eric Dumazet
- Fix warnings on 32 bit
Changelog from v6:
- review comments by Daniel Walker addressed
- checkpatch cleanup
- fix build on 32 bit
- maintainers entry corrected
Changelog from v5:
- tun support
- backends with virtio net header support (enables GSO, checksum etc)
- 32 bit compat fixed
- support indirect buffers, tx exit mitigation,
tx interrupt mitigation
- support write logging (allows migration without virtio ring code in userspace)
Changelog from v4:
- disable rx notification when we have rx buffers
- addressed all comments from Rusty's review
- copy bugfixes from lguest commits:
ebf9a5a99c1a464afe0b4dfa64416fc8b273bc5c
e606490c440900e50ccf73a54f6fc6150ff40815
Changelog from v3:
- checkpatch fixes
Changelog from v2:
- Comments on RCU usage
- Compat ioctl support
- Make variable static
- Copied more idiomatic English from Rusty
Changes from v1:
- Move use_mm/unuse_mm from fs/aio.c to mm instead of copying.
- Reorder code to avoid need for forward declarations
- Kill a couple of debugging printks
Michael S. Tsirkin (3):
tun: export underlying socket
mm: export use_mm/unuse_mm to modules
vhost_net: a kernel-level virtio server
MAINTAINERS | 9 +
arch/x86/kvm/Kconfig | 1 +
drivers/Makefile | 1 +
drivers/net/tun.c | 101 ++++-
drivers/vhost/Kconfig | 11 +
drivers/vhost/Makefile | 2 +
drivers/vhost/net.c | 633 +++++++++++++++++++++++++++++
drivers/vhost/vhost.c | 970 ++++++++++++++++++++++++++++++++++++++++++++
drivers/vhost/vhost.h | 158 +++++++
include/linux/Kbuild | 1 +
include/linux/if_tun.h | 14 +
include/linux/miscdevice.h | 1 +
include/linux/vhost.h | 126 ++++++
mm/mmu_context.c | 3 +
14 files changed, 2012 insertions(+), 19 deletions(-)
create mode 100644 drivers/vhost/Kconfig
create mode 100644 drivers/vhost/Makefile
create mode 100644 drivers/vhost/net.c
create mode 100644 drivers/vhost/vhost.c
create mode 100644 drivers/vhost/vhost.h
create mode 100644 include/linux/vhost.h
Michael S. Tsirkin wrote:
> Ok, I think I've addressed all comments so far here.
> Rusty, I'd like this to go into linux-next, through your tree, and
> hopefully 2.6.33. What do you think?
I think the benchmark data is a prerequisite for merge consideration, IMO.
Do you have anything for us to look at? I think comparisons that show
the following are of interest:
throughput (e.g. netperf::TCP_STREAM): guest->host, guest->host->guest,
guest->host->remote, host->remote, remote->host->guest
latency (e.g. netperf::UDP_RR): same conditions as throughput
cpu-utilization
others?
Ideally, this should be at least between upstream virtio and vhost.
Bonus points if you include venet as well.
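A sweep over the matrix above could be driven by a small script along
these lines. This is a hypothetical sketch: the hostnames `guest`,
`host`, and `remote` are placeholders, and the netperf command lines are
printed rather than executed, since actually running them assumes
netserver is already listening on each peer:

```shell
# Print one netperf invocation per (peer, test) pair in the matrix.
# -l 60: 60-second runs; -c/-C: report local/remote cpu utilization.
sweep() {
	for h in guest host remote; do
		for t in TCP_STREAM UDP_RR; do
			echo "netperf -H $h -t $t -l 60 -c -C"
		done
	done
}
sweep
```

Piping the output to `sh` instead of reading it would run the sweep for
real once the peers are up.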
Kind regards,
-Greg
On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > Ok, I think I've addressed all comments so far here.
> > Rusty, I'd like this to go into linux-next, through your tree, and
> > hopefully 2.6.33. What do you think?
>
> I think the benchmark data is a prerequisite for merge consideration, IMO.
Shirley Ma was kind enough to send me some measurement results showing
how kernel-level acceleration helps speed things up; you can find them here:
http://www.linux-kvm.org/page/VhostNet
Generally, I think that merging should happen *before* aggressive
benchmarking/performance tuning: otherwise there is a very substantial
risk that what is an optimization in one setup hurts performance in
another one. When the code is upstream, people can bisect to debug
regressions. Another good reason is that I can stop spending time
rebasing and start profiling.
> Do you have anything for us to look at?
For guest to host, compared to the latest qemu with the userspace virtio
backend, latency drops by a factor of 6, bandwidth doubles, and cpu
utilization drops slightly :)
> I think comparisons that show the following are of interest:
>
> throughput (e.g. netperf::TCP_STREAM): guest->host, guest->host->guest,
> guest->host->remote, host->remote, remote->host->guest
>
> latency (e.g. netperf::UDP_RR): same conditions as throughput
>
> cpu-utilization
>
> others?
>
> Ideally, this should be at least between upstream virtio and vhost.
> Bonus points if you include venet as well.
And vmxnet3 :)
> Kind regards,
> -Greg
>
--
MST
Michael S. Tsirkin wrote:
> On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
>> Michael S. Tsirkin wrote:
>>> Ok, I think I've addressed all comments so far here.
>>> Rusty, I'd like this to go into linux-next, through your tree, and
>>> hopefully 2.6.33. What do you think?
>> I think the benchmark data is a prerequisite for merge consideration, IMO.
>
> Shirley Ma was kind enough to send me some measurement results showing
> how kernel level acceleration helps speed up you can find them here:
> http://www.linux-kvm.org/page/VhostNet
Thanks for the pointers. I will roll your latest v8 code into our test
matrix. What kernel/qemu trees do they apply to?
-Greg
On Wed, Nov 04, 2009 at 02:15:42PM -0500, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Wed, Nov 04, 2009 at 11:02:15AM -0500, Gregory Haskins wrote:
> >> Michael S. Tsirkin wrote:
> >>> Ok, I think I've addressed all comments so far here.
> >>> Rusty, I'd like this to go into linux-next, through your tree, and
> >>> hopefully 2.6.33. What do you think?
> >> I think the benchmark data is a prerequisite for merge consideration, IMO.
> >
> > Shirley Ma was kind enough to send me some measurement results showing
> > how kernel level acceleration helps speed up you can find them here:
> > http://www.linux-kvm.org/page/VhostNet
>
> Thanks for the pointers. I will roll your latest v8 code into our test
> matrix. What kernel/qemu trees do they apply to?
>
> -Greg
>
kernel 2.6.32-rc6, qemu-kvm 47e465f031fc43c53ea8f08fa55cc3482c6435c8.
You can also use my development git trees if you like.
kernel:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost
userspace:
git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu-kvm.git vhost
Please note that I rebase these trees, especially the userspace one, now and then.
--
MST