2011-03-31 17:31:02

by Pekka Enberg

Subject: [ANNOUNCE] Native Linux KVM tool

Hi all,

We’re proud to announce the native Linux KVM tool!

The goal of this tool is to provide a clean, from-scratch, lightweight
KVM host tool implementation that can boot Linux guest images (just a
hobby, won't be big and professional like QEMU) with no BIOS
dependencies and with only the minimal amount of legacy device
emulation.

Note that this is a development prototype for the time being: there's no
networking support and no graphics support, amongst other missing
essentials.

It's great as a learning tool if you want to get your feet wet in
virtualization land: it's only 5 KLOC of clean C code that can already
boot a guest Linux image.

Right now it can boot a Linux image and provide you output via a serial
console, over the host terminal, i.e. you can use it to boot a guest
Linux image in a terminal or over ssh and log into the guest without
much guest or host side setup work needed.

1. To try out the tool, clone the git repository:

git clone git://github.com/penberg/linux-kvm.git

or alternatively, if you already have a kernel source tree:

git checkout -b kvm/tool
git pull git://github.com/penberg/linux-kvm.git

2. Compile the tool:

cd tools/kvm && make

3. Download a raw userspace image:

wget http://wiki.qemu.org/download/linux-0.2.img.bz2 && bunzip2 linux-0.2.img.bz2

4. Build a kernel with CONFIG_VIRTIO_BLK=y and
CONFIG_SERIAL_8250_CONSOLE=y configuration options. Note: also make sure
you have CONFIG_EXT2_FS or CONFIG_EXT4_FS if you use the above image.

5. And finally, launch the hypervisor:

./kvm --image=linux-0.2.img --kernel=../../arch/x86/boot/bzImage
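Before launching, the options from step 4 can be checked against the kernel's .config. The following is only a rough sketch: the `check_guest_config` helper and its default `.config` path are assumptions made for illustration and are not part of tools/kvm.

```shell
#!/bin/sh
# Sketch: verify that a kernel .config enables the options listed in step 4.
# check_guest_config is a hypothetical helper, not something the tool ships.
check_guest_config() {
    config="${1:-.config}"
    missing=0

    # Options the announcement asks to have built in (=y).
    for opt in CONFIG_VIRTIO_BLK CONFIG_SERIAL_8250_CONSOLE; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: ${opt}=y"
            missing=1
        fi
    done

    # linux-0.2.img needs ext2 or ext4 support; =y or =m is accepted here.
    if ! grep -Eq '^CONFIG_EXT[24]_FS=[ym]$' "$config"; then
        echo "missing: CONFIG_EXT2_FS or CONFIG_EXT4_FS"
        missing=1
    fi

    return "$missing"
}
```

From tools/kvm it could be called as `check_guest_config ../../.config`, assuming the kernel was configured in the top-level source directory; a non-zero exit status means at least one option is missing.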

The tool has been written by Pekka Enberg, Cyrill Gorcunov, and Asias
He. Special thanks to Avi Kivity for his help on KVM internals and Ingo
Molnar for all-around support and encouragement!

See the following thread for original discussion for motivation of this
project:

http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620

Pekka


2011-04-01 07:06:23

by Carsten Otte

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 31.03.2011 23:47, Alexander Graf wrote:
> Did you take a look at kuli?
Right. If you want to take kuli as a start, and GPL is acceptable for
you ;-).
If kuli seems right, I suggest to go ahead and make an upstream git
repository for it. I'd be happy to keep contributing the s390 parts to
whoever feels responsible for maintaining the tree.

cheers,
Carsten



2011-04-01 07:37:13

by Cyrill Gorcunov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Friday, April 1, 2011, Carsten Otte <[email protected]> wrote:
> On 31.03.2011 23:47, Alexander Graf wrote:
> > Did you take a look at kuli?
>
> Right. If you want to take kuli as a start, and GPL is acceptable for
> you ;-).
> If kuli seems right, I suggest to go ahead and make an upstream git
> repository for it. I'd be happy to keep contributing the s390 parts to
> whoever feels responsible for maintaining the tree.
>
> cheers,
> Carsten
>
Cool! I personally hadn't heard of kuli (thanks for the hint) but will
take a look. Carsten, Pekka is the primary maintainer, so if you're
interested in maintaining the s390 part, I think we'd be glad to accept it ;)

2011-04-01 14:26:26

by Steven Rostedt

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Thu, Mar 31, 2011 at 08:30:56PM +0300, Pekka Enberg wrote:
>
> The goal of this tool is to provide a clean, from-scratch, lightweight
> KVM host tool implementation that can boot Linux guest images (just a
> hobby, won't be big and professional like QEMU)

That line looks very familiar...

"I'm doing a (free) operating system (just a hobby, won't be big and
professional like gnu) for 386(486) AT clones."

;)

-- Steve

2011-04-02 20:38:13

by Anthony Liguori

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 03/31/2011 12:30 PM, Pekka Enberg wrote:
> Hi all,
>
> We’re proud to announce the native Linux KVM tool!

Neat!

As something of a lesson of history, I'd suggest picking a more unique
name while it's still a prototype :-)

> The goal of this tool is to provide a clean, from-scratch, lightweight
> KVM host tool implementation that can boot Linux guest images (just a
> hobby, won't be big and professional like QEMU) with no BIOS
> dependencies and with only the minimal amount of legacy device
> emulation.

I see you do provide 16-bit entry points for Linux. Are you planning on
paravirtualizing this within Linux to truly eliminate the BIOS dependency?

Regards,

Anthony Liguori

> Note that this is a development prototype for the time being: there's no
> networking support and no graphics support, amongst other missing
> essentials.
>
> It's great as a learning tool if you want to get your feet wet in
> virtualization land: it's only 5 KLOC of clean C code that can already
> boot a guest Linux image.
>
> Right now it can boot a Linux image and provide you output via a serial
> console, over the host terminal, i.e. you can use it to boot a guest
> Linux image in a terminal or over ssh and log into the guest without
> much guest or host side setup work needed.
>
> 1. To try out the tool, clone the git repository:
>
> git clone git://github.com/penberg/linux-kvm.git
>
> or alternatively, if you already have a kernel source tree:
>
> git checkout -b kvm/tool
> git pull git://github.com/penberg/linux-kvm.git
>
> 2. Compile the tool:
>
> cd tools/kvm && make
>
> 3. Download a raw userspace image:
>
> wget http://wiki.qemu.org/download/linux-0.2.img.bz2 && bunzip2 linux-0.2.img.bz2
>
> 4. Build a kernel with CONFIG_VIRTIO_BLK=y and
> CONFIG_SERIAL_8250_CONSOLE=y configuration options. Note: also make sure
> you have CONFIG_EXT2_FS or CONFIG_EXT4_FS if you use the above image.
>
> 5. And finally, launch the hypervisor:
>
> ./kvm --image=linux-0.2.img --kernel=../../arch/x86/boot/bzImage
>
> The tool has been written by Pekka Enberg, Cyrill Gorcunov, and Asias
> He. Special thanks to Avi Kivity for his help on KVM internals and Ingo
> Molnar for all-around support and encouragement!
>
> See the following thread for original discussion for motivation of this
> project:
>
> http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620
>
> Pekka
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-04-03 06:21:39

by Ingo Molnar

Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Anthony Liguori <[email protected]> wrote:

> On 03/31/2011 12:30 PM, Pekka Enberg wrote:
> > Hi all,
> >
> > We’re proud to announce the native Linux KVM tool!
>
> Neat!
>
> As something of a lesson of history, I'd suggest picking a more unique name
> while it's still a prototype :-)

I disagree, i find it pretty handy and intuitive to run 'kvm ./disk.img' to
boot KVM and this particular tool name has not been taken yet either.

perf uses a similar concept: the kernel subsystem is generally called 'perf',
and the (Linux specific) user-space tool is called 'perf' as well. It makes
quite a bit of sense.

Thanks,

Ingo

2011-04-03 08:24:14

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 03/31/2011 07:30 PM, Pekka Enberg wrote:
> Hi all,
>
> We’re proud to announce the native Linux KVM tool!

So that's where you disappeared - I was following your old repository.

> The goal of this tool is to provide a clean, from-scratch, lightweight
> KVM host tool implementation that can boot Linux guest images (just a
> hobby, won't be big and professional like QEMU) with no BIOS
> dependencies and with only the minimal amount of legacy device
> emulation.
>
> Note that this is a development prototype for the time being: there's no
> networking support and no graphics support, amongst other missing
> essentials.

Mind posting a roadmap? I would put smp support near the top. This
sort of thing has to be designed in, otherwise you wind up with a big
lock like qemu.


--
error compiling committee.c: too many arguments to function

2011-04-03 08:24:58

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 09:21 AM, Ingo Molnar wrote:
> * Anthony Liguori<[email protected]> wrote:
>
> > On 03/31/2011 12:30 PM, Pekka Enberg wrote:
> > > Hi all,
> > >
> > > We’re proud to announce the native Linux KVM tool!
> >
> > Neat!
> >
> > As something of a lesson of history, I'd suggest picking a more unique name
> > while it's still a prototype :-)
>
> I disagree, i find it pretty handy and intuitive to run 'kvm ./disk.img' to
> boot KVM and this particular tool name has not been taken yet either.

Some distributions install qemu-kvm as /usr/bin/kvm.
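Whether the name is already taken on a given host can be checked from the shell. This is a small sketch only; `describe_kvm` is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# Sketch: report what the command name "kvm" currently resolves to.
# describe_kvm is a hypothetical helper used only for illustration.
describe_kvm() {
    kvm_path=$(command -v kvm) || {
        echo "kvm: not found in PATH"
        return 1
    }
    echo "kvm resolves to: ${kvm_path}"
}
```

On a distribution that ships qemu-kvm under that name, the reported path would be the existing binary that a new tool called 'kvm' would collide with.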

> perf uses a similar concept: the kernel subsystem is generally called 'perf',
> and the (Linux specific) user-space tool is called 'perf' as well. It makes
> quite a bit of sense.

Well, this is bound to cause confusion as the tool is yet quite immature.

--
error compiling committee.c: too many arguments to function

2011-04-03 08:52:02

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Anthony,

On Sat, Apr 2, 2011 at 11:38 PM, Anthony Liguori <[email protected]> wrote:
>> The goal of this tool is to provide a clean, from-scratch, lightweight
>> KVM host tool implementation that can boot Linux guest images (just a
>> hobby, won't be big and professional like QEMU) with no BIOS
>> dependencies and with only the minimal amount of legacy device
>> emulation.
>
> I see you do provide 16-bit entry points for Linux. Are you planning on
> paravirtualizing this within Linux to truly eliminate the BIOS dependency?

No, we aren't planning that at the moment. We're trying to support
out-of-the-box distro kernels when possible which is why we went for
E820 emulation in the first place. The only hard requirement for
booting userspace is CONFIG_VIRTIO_BLK but otherwise kernel binaries
should just work.

Furthermore, as the BIOS glue is really really small, I'm not sure if
we need to get rid of it completely. Do you have some scenario in mind
where paravirt solution would help?

Pekka

2011-04-03 08:53:36

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi,

On Sun, Apr 3, 2011 at 11:24 AM, Avi Kivity <[email protected]> wrote:
> On 04/03/2011 09:21 AM, Ingo Molnar wrote:
>>
>> * Anthony Liguori<[email protected]> wrote:
>>
>> > On 03/31/2011 12:30 PM, Pekka Enberg wrote:
>> > > Hi all,
>> > >
>> > > We’re proud to announce the native Linux KVM tool!
>> >
>> > Neat!
>> >
>> > As something of a lesson of history, I'd suggest picking a more unique
>> > name while it's still a prototype :-)
>>
>> I disagree, i find it pretty handy and intuitive to run 'kvm ./disk.img'
>> to boot KVM and this particular tool name has not been taken yet either.
>
> Some distributions install qemu-kvm as /usr/bin/kvm.
>
>> perf uses a similar concept: the kernel subsystem is generally called
>> 'perf', and the (Linux specific) user-space tool is called 'perf' as
>> well. It makes quite a bit of sense.
>
> Well, this is bound to cause confusion as the tool is yet quite immature.

Yes, that's really unfortunate. I don't care too much what we call the
tool but I definitely agree with Ingo that 'kvm' is more discoverable
to users. Any suggestions?

Pekka

2011-04-03 09:02:23

by Alon Levy

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Thu, Mar 31, 2011 at 08:30:56PM +0300, Pekka Enberg wrote:
> Hi all,
>
> We’re proud to announce the native Linux KVM tool!
>
> The goal of this tool is to provide a clean, from-scratch, lightweight
> KVM host tool implementation that can boot Linux guest images (just a
> hobby, won't be big and professional like QEMU) with no BIOS
> dependencies and with only the minimal amount of legacy device
> emulation.
>
> Note that this is a development prototype for the time being: there's no
> networking support and no graphics support, amongst other missing
> essentials.

I've looked at how to add spice to this, the qxl device should be relatively
easy to add as it's just another pci device and you already support the virtio
block pci device. But to add the spice server library there needs to be some
simple fd and timer (i.e. select/epoll) event loop, which I see is missing. Are
you planning on adding something like that?

>
> It's great as a learning tool if you want to get your feet wet in
> virtualization land: it's only 5 KLOC of clean C code that can already
> boot a guest Linux image.
>
> Right now it can boot a Linux image and provide you output via a serial
> console, over the host terminal, i.e. you can use it to boot a guest
> Linux image in a terminal or over ssh and log into the guest without
> much guest or host side setup work needed.
>
> 1. To try out the tool, clone the git repository:
>
> git clone git://github.com/penberg/linux-kvm.git
>
> or alternatively, if you already have a kernel source tree:
>
> git checkout -b kvm/tool
> git pull git://github.com/penberg/linux-kvm.git
>
> 2. Compile the tool:
>
> cd tools/kvm && make
>
> 3. Download a raw userspace image:
>
> wget http://wiki.qemu.org/download/linux-0.2.img.bz2 && bunzip2 linux-0.2.img.bz2
>
> 4. Build a kernel with CONFIG_VIRTIO_BLK=y and
> CONFIG_SERIAL_8250_CONSOLE=y configuration options. Note: also make sure
> you have CONFIG_EXT2_FS or CONFIG_EXT4_FS if you use the above image.
>
> 5. And finally, launch the hypervisor:
>
> ./kvm --image=linux-0.2.img --kernel=../../arch/x86/boot/bzImage
>
> The tool has been written by Pekka Enberg, Cyrill Gorcunov, and Asias
> He. Special thanks to Avi Kivity for his help on KVM internals and Ingo
> Molnar for all-around support and encouragement!
>
> See the following thread for original discussion for motivation of this
> project:
>
> http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620
>
> Pekka

2011-04-03 09:06:22

by Cyrill Gorcunov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 12:53 PM, Pekka Enberg wrote:
> Hi,
>
> On Sun, Apr 3, 2011 at 11:24 AM, Avi Kivity <[email protected]> wrote:
>> On 04/03/2011 09:21 AM, Ingo Molnar wrote:
>>>
>>> * Anthony Liguori<[email protected]> wrote:
>>>
>>>> On 03/31/2011 12:30 PM, Pekka Enberg wrote:
>>>> > Hi all,
>>>> >
>>>> > We’re proud to announce the native Linux KVM tool!
>>>>
>>>> Neat!
>>>>
>>>> As something of a lesson of history, I'd suggest picking a more unique
>>>> name while it's still a prototype :-)
>>>
>>> I disagree, i find it pretty handy and intuitive to run 'kvm ./disk.img'
>>> to boot KVM and this particular tool name has not been taken yet either.
>>
>> Some distributions install qemu-kvm as /usr/bin/kvm.
>>
>>> perf uses a similar concept: the kernel subsystem is generally called
>>> 'perf', and the (Linux specific) user-space tool is called 'perf' as
>>> well. It makes quite a bit of sense.
>>
>> Well, this is bound to cause confusion as the tool is yet quite immature.
>
> Yes, that's really unfortunate. I don't care too much what we call the
> tool but I definitely agree with Ingo that 'kvm' is more discoverable
> to users. Any suggestions?
>
> Pekka

Well, I personally do not care much either. If there is a fear that we might
clash with some distribution, we could probably rename it "nkvm" (i.e. from
Native KVM). I've googled it and found that no such name is in use yet. Hm?

--
Cyrill

2011-04-03 09:17:49

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 11:51 AM, Pekka Enberg wrote:
> Hi Anthony,
>
> On Sat, Apr 2, 2011 at 11:38 PM, Anthony Liguori<[email protected]> wrote:
> >> The goal of this tool is to provide a clean, from-scratch, lightweight
> >> KVM host tool implementation that can boot Linux guest images (just a
> >> hobby, won't be big and professional like QEMU) with no BIOS
> >> dependencies and with only the minimal amount of legacy device
> >> emulation.
> >
> > I see you do provide 16-bit entry points for Linux. Are you planning on
> > paravirtualizing this within Linux to truly eliminate the BIOS dependency?
>
> No, we aren't planning that at the moment. We're trying to support
> out-of-the-box distro kernels when possible which is why we went for
> E820 emulation in the first place. The only hard requirement for
> booting userspace is CONFIG_VIRTIO_BLK but otherwise kernel binaries
> should just work.
>
> Furthermore, as the BIOS glue is really really small, I'm not sure if
> we need to get rid of it completely. Do you have some scenario in mind
> where paravirt solution would help?

It would be easier to support the bios than implement everything it
provides in a different way. SMP support, cpu hotplug, device hotplug,
NUMA, and probably other features all rely on the bios.

--
error compiling committee.c: too many arguments to function

2011-04-03 09:59:08

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Avi,

On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity <[email protected]> wrote:
>> Note that this is a development prototype for the time being: there's no
>> networking support and no graphics support, amongst other missing
>> essentials.
>
> Mind posting a roadmap? I would put smp support near the top. This sort of
> thing has to be designed in, otherwise you wind up with a big lock like
> qemu.

What are the pain points with qemu at the moment?

SMP, networking, and simpler guest to host communication from shell
are most interesting missing features for me. I'd also love to have
GPU support for X and friends.

Pekka

2011-04-03 10:01:43

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Sun, 2011-04-03 at 12:01 +0300, Alon Levy wrote:
> On Thu, Mar 31, 2011 at 08:30:56PM +0300, Pekka Enberg wrote:
> > Hi all,
> >
> > We’re proud to announce the native Linux KVM tool!
> >
> > The goal of this tool is to provide a clean, from-scratch, lightweight
> > KVM host tool implementation that can boot Linux guest images (just a
> > hobby, won't be big and professional like QEMU) with no BIOS
> > dependencies and with only the minimal amount of legacy device
> > emulation.
> >
> > Note that this is a development prototype for the time being: there's no
> > networking support and no graphics support, amongst other missing
> > essentials.
>
> I've looked at how to add spice to this, the qxl device should be relatively
> easy to add as it's just another pci device and you already support the virtio
> block pci device. But to add the spice server library there needs to be some
> simple fd and timer (i.e. select/epoll) event loop, which I see is missing. Are
> you planning on adding something like that?

We have kvm__start_timer() in tools/kvm/kvm.c. Can you use that as a
base for qxl?

Pekka

2011-04-03 10:11:44

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 12:59 PM, Pekka Enberg wrote:
> Hi Avi,
>
> On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity<[email protected]> wrote:
> >> Note that this is a development prototype for the time being: there's no
> >> networking support and no graphics support, amongst other missing
> >> essentials.
> >
> > Mind posting a roadmap? I would put smp support near the top. This sort of
> > thing has to be designed in, otherwise you wind up with a big lock like
> > qemu.
>
> What are the pain points with qemu at the moment?

It's an ugly gooball.

> SMP, networking, and simpler guest to host communication from shell
> are most interesting missing features for me.

If it is to be more than a toy, then Windows (really generic guest)
support, manageability, live migration, hotplug, etc. are all crucial.

> I'd also love to have
> GPU support for X and friends.

Should be easy to get by integrating spice (but that gives you a
remote-optimized display, not local).

--
error compiling committee.c: too many arguments to function

2011-04-03 10:15:42

by Alon Levy

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Sun, Apr 03, 2011 at 01:01:38PM +0300, Pekka Enberg wrote:
> On Sun, 2011-04-03 at 12:01 +0300, Alon Levy wrote:
> > On Thu, Mar 31, 2011 at 08:30:56PM +0300, Pekka Enberg wrote:
> > > Hi all,
> > >
> > > We’re proud to announce the native Linux KVM tool!
> > >
> > > The goal of this tool is to provide a clean, from-scratch, lightweight
> > > KVM host tool implementation that can boot Linux guest images (just a
> > > hobby, won't be big and professional like QEMU) with no BIOS
> > > dependencies and with only the minimal amount of legacy device
> > > emulation.
> > >
> > > Note that this is a development prototype for the time being: there's no
> > > networking support and no graphics support, amongst other missing
> > > essentials.
> >
> > I've looked at how to add spice to this, the qxl device should be relatively
> > easy to add as it's just another pci device and you already support the virtio
> > block pci device. But to add the spice server library there needs to be some
> > simple fd and timer (i.e. select/epoll) event loop, which I see is missing. Are
> > you planning on adding something like that?
>
> We have kvm__start_timer() in tools/kvm/kvm.c. Can you use that as a
> base for qxl?
>

Haven't looked at it closely enough yet, but if I can use it to schedule a
wakeup at an arbitrary time ahead then it's good enough for timers. For waking
up as a result of a socket being ready for read or write I need something else.
Maybe I'm missing something and you already have something like that.

> Pekka

2011-04-03 10:17:55

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Avi,

On Sun, Apr 3, 2011 at 1:11 PM, Avi Kivity <[email protected]> wrote:
>> SMP, networking, and simpler guest to host communication from shell
>> are most interesting missing features for me.
>
> If it is to be more than a toy, then Windows (really generic guest) support,
> manageability, live migration, hotplug, etc. are all crucial.

It's definitely not a toy, it's my main virtualization tool of choice
for kernel development! ;-)

The features you mention are crucial for servers but not for desktop.
I personally don't have much need for managing and live-migrating
Windows guests but if someone is interested in working on that, we're
happy to take patches!

Pekka

2011-04-03 10:22:45

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 01:17 PM, Pekka Enberg wrote:
> Hi Avi,
>
> On Sun, Apr 3, 2011 at 1:11 PM, Avi Kivity<[email protected]> wrote:
> >> SMP, networking, and simpler guest to host communication from shell
> >> are most interesting missing features for me.
> >
> > If it is to be more than a toy, then Windows (really generic guest) support,
> > manageability, live migration, hotplug, etc. are all crucial.
>
> It's definitely not a toy, it's my main virtualization tool of choice
> for kernel development! ;-)
>
> The features you mention are crucial for servers but not for desktop.
> I personally don't have much need for managing and live-migrating
> Windows guests but if someone is interested in working on that, we're
> happy to take patches!

Well, I'd say you do need generic guest support for desktop (but not the
other stuff I mentioned). Are you planning to add a real GUI?

--
error compiling committee.c: too many arguments to function

2011-04-03 10:23:21

by CaT

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Sun, Apr 03, 2011 at 11:53:34AM +0300, Pekka Enberg wrote:
> Yes, that's really unfortunate. I don't care too much what we call the
> tool but I definitely agree with Ingo that 'kvm' is more discoverable

Indeed. It's great for finding info on keyboard-video-mouse switches. :)

--
"A search of his car uncovered pornography, a homemade sex aid, women's
stockings and a Jack Russell terrier."
- http://www.dailytelegraph.com.au/news/wacky/indeed/story-e6frev20-1111118083480

2011-04-03 10:32:21

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Avi,

On Sun, Apr 3, 2011 at 1:22 PM, Avi Kivity <[email protected]> wrote:
>> It's definitely not a toy, it's my main virtualization tool of choice
>> for kernel development! ;-)
>>
>> The features you mention are crucial for servers but not for desktop.
>> I personally don't have much need for managing and live-migrating
>> Windows guests but if someone is interested in working on that, we're
>> happy to take patches!
>
> Well, I'd say you do need generic guest support for desktop (but not the
> other stuff I mentioned). Are you planning to add a real GUI?

Yes, we'll hopefully add GUI at some point.

Generic guest support would be cool but like I said, I personally
don't need it because I'm mostly interested in running Linux guests.
Windows is totally uninteresting and I'm hoping we can get away with
virtio drivers for other FOSS kernels.

Pekka

2011-04-03 13:09:07

by Anthony Liguori

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 05:11 AM, Avi Kivity wrote:
> On 04/03/2011 12:59 PM, Pekka Enberg wrote:
>> Hi Avi,
>>
>> On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity<[email protected]> wrote:
>> >> Note that this is a development prototype for the time being: there's no
>> >> networking support and no graphics support, amongst other missing
>> >> essentials.
>> >
>> > Mind posting a roadmap? I would put smp support near the top. This sort of
>> > thing has to be designed in, otherwise you wind up with a big lock like
>> > qemu.
>>
>> What are the pain points with qemu at the moment?
>
> It's an ugly gooball.

Because it solves a lot of very difficult problems.

You could drop all of the TCG support and it'd still be an ugly gooball.

Supporting lots of different emulated hardware devices, live migration,
tons of different types of networking and image formats, etc., all adds
up over time.

>> SMP, networking, and simpler guest to host communication from shell
>> are most interesting missing features for me.
>
> If it is to be more than a toy, then Windows (really generic guest)
> support, manageability, live migration, hotplug, etc. are all crucial.

I concur that SMP is probably one of those features you need to start
with if you're designing something from scratch.

Regards,

Anthony Liguori

2011-04-03 13:19:37

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/03/2011 04:09 PM, Anthony Liguori wrote:
> On 04/03/2011 05:11 AM, Avi Kivity wrote:
>> On 04/03/2011 12:59 PM, Pekka Enberg wrote:
>>> Hi Avi,
>>>
>>> On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity<[email protected]> wrote:
>>> >> Note that this is a development prototype for the time being: there's no
>>> >> networking support and no graphics support, amongst other missing
>>> >> essentials.
>>> >
>>> > Mind posting a roadmap? I would put smp support near the top. This sort of
>>> > thing has to be designed in, otherwise you wind up with a big lock like
>>> > qemu.
>>>
>>> What are the pain points with qemu at the moment?
>>
>> It's an ugly gooball.
>
> Because it solves a lot of very difficult problems.
>
> You could drop all of the TCG support and it'd still be an ugly gooball.
>
> Supporting lots of different emulated hardware devices, live
> migration, tons of different types of networking and image formats,
> etc., all adds up over time.

Sure, any successful project becomes an ugly gooball. It's almost a
compliment.

--
error compiling committee.c: too many arguments to function

2011-04-04 10:35:30

by Ingo Molnar

Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Pekka Enberg <[email protected]> wrote:

> > Well, this is bound to cause confusion as the tool is yet quite immature.
>
> Yes, that's really unfortunate. I don't care too much what we call the tool
> but I definitely agree with Ingo that 'kvm' is more discoverable to users.
> Any suggestions?

Well, let's keep it 'kvm' - if it stays immature it doesn't matter much. If it
improves it has the right and obvious tool name.

Thanks,

Ingo

2011-04-06 09:00:56

by Markus Armbruster

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Anthony Liguori <[email protected]> writes:

> On 04/03/2011 05:11 AM, Avi Kivity wrote:
>> On 04/03/2011 12:59 PM, Pekka Enberg wrote:
>>> Hi Avi,
>>>
>>> On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity<[email protected]> wrote:
>>> >> Note that this is a development prototype for the time being: there's no
>>> >> networking support and no graphics support, amongst other missing
>>> >> essentials.
>>> >
>>> > Mind posting a roadmap? I would put smp support near the top. This sort of
>>> > thing has to be designed in, otherwise you wind up with a big lock like
>>> > qemu.
>>>
>>> What are the pain points with qemu at the moment?
>>
>> It's an ugly gooball.
>
> Because it solves a lot of very difficult problems.

And the solutions emerged / evolved over a long time. Meanwhile, goals
shifted. It wasn't designed as user space for KVM, it got shoehorned
into that role (successfully).

It has some solutions it should have left to other tools. For instance,
it shouldn't be in the network configuration business.

> You could drop all of the TCG support and it'd still be an ugly gooball.
>
> Supporting lots of different emulated hardware devices, live
> migration, tons of different types of networking and image formats,
> etc., all adds up over time.

It does. Still, a fresh start could lead to a less ugly gooball.

>>> SMP, networking, and simpler guest to host communication from shell
>>> are most interesting missing features for me.
>>
>> If it is to be more than a toy, then Windows (really generic guest)
>> support, manageability, live migration, hotplug, etc. are all
>> crucial.
>
> I concur that SMP is probably one of those features you need to start
> with if you're designing something from scratch.

Certainly. Another one that doesn't like retrofitting is security.

2011-04-06 09:31:01

by Gleb Natapov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Wed, Apr 06, 2011 at 10:59:45AM +0200, Markus Armbruster wrote:
> Anthony Liguori <[email protected]> writes:
>
> > On 04/03/2011 05:11 AM, Avi Kivity wrote:
> >> On 04/03/2011 12:59 PM, Pekka Enberg wrote:
> >>> Hi Avi,
> >>>
> >>> On Sun, Apr 3, 2011 at 11:23 AM, Avi Kivity<[email protected]> wrote:
> >>> >> Note that this is a development prototype for the time being: there's no
> >>> >> networking support and no graphics support, amongst other missing
> >>> >> essentials.
> >>> >
> >>> > Mind posting a roadmap? I would put smp support near the top. This sort of
> >>> > thing has to be designed in, otherwise you wind up with a big lock like
> >>> > qemu.
> >>>
> >>> What are the pain points with qemu at the moment?
> >>
> >> It's an ugly gooball.
> >
> > Because it solves a lot of very difficult problems.
>
> And the solutions emerged / evolved over a long time. Meanwhile, goals
> shifted. It wasn't designed as user space for KVM, it got shoehorned
> into that role (successfully).
>
> It has some solutions it should have left to other tools. For instance,
> it shouldn't be in the network configuration business.
>
> > You could drop all of the TCG support and it'd still be an ugly gooball.
> >
> > Supporting lots of different emulated hardware devices, live
> > migration, tons of different types of networking and image formats,
> > etc., all adds up over time.
>
> It does. Still, a fresh start could lead to a less ugly gooball.
>
> >>> SMP, networking, and simpler guest to host communication from shell
> >>> are most interesting missing features for me.
> >>
> >> If it is to be more than a toy, then Windows (really generic guest)
> >> support, manageability, live migration, hotplug, etc. are all
> >> crucial.
> >
> > I concur that SMP is probably one of those features you need to start
> > with if you're designing something from scratch.
>
> Certainly. Another one that doesn't like retrofitting is security.
And migration :)

--
Gleb.

2011-04-06 09:33:48

by Ingo Molnar

Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Avi Kivity <[email protected]> wrote:

> Sure, any successful project becomes an ugly gooball. It's almost a
> compliment.

I disagree strongly with that sentiment and there's several good counter
examples:

- the Git project is also highly successful and is kept very clean (and has a
project size comparable to Qemu)

- the Linux kernel is also very clean in all areas i care about and has most
of its ugliness stuffed into drivers/staging/ (and has a project size more
than an order of magnitude larger than Qemu).

In fact i claim the exact opposite: certain types of projects can only grow
beyond a certain size and stay healthy if they are *not* ugly gooballs.

Examples: X11 and GCC - both were struggling for years to break through magic
invisible barriers of growth and IMHO a lot of it had to do with the lack of
code (and development model) cleanliness.

So no, your kind of cynical, defeatist sentiment about code quality is by no
means true in my experience. Projects become ugly gooballs once maintainers
stop caring enough.

Thanks,

Ingo

2011-04-06 09:37:23

by Gleb Natapov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
> So no, your kind of cynical, defeatist sentiment about code quality is by no
> means true in my experience. Projects become ugly gooballs once maintainers
> stop caring enough.
>
In the case of Qemu it was the other way around. Maintainers started caring
too late.

--
Gleb.

2011-04-06 09:46:32

by Ingo Molnar

Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Gleb Natapov <[email protected]> wrote:

> On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
>
> > So no, your kind of cynical, defeatist sentiment about code quality is by
> > no means true in my experience. Projects become ugly gooballs once
> > maintainers stop caring enough.
>
> In case of Qemu it was other way around. Maintainers started caring too late.

Nah, i do not think it's ever too late to care.

Example: arch/i386 - arch/x86_64/ was very messy for many, many years and we
turned it around and can be proud of arch/x86/ today - but i guess i'm somewhat
biased there ;-)

In my experience it's entirely possible to turn a messy gooball into something
you can be proud of - it's all reversible. Start small, with the core bits you
care about most - then extend those concepts to other areas of the code base,
gradually. There might be subsystems that will never turn around before
becoming obsolete - that's not a big problem.

Thanks,

Ingo

2011-04-06 09:49:43

by Avi Kivity

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/06/2011 12:46 PM, Ingo Molnar wrote:
> * Gleb Natapov<[email protected]> wrote:
>
> > On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
> >
> > > So no, your kind of cynical, defeatist sentiment about code quality is by
> > > no means true in my experience. Projects become ugly gooballs once
> > > maintainers stop caring enough.
> >
> > In case of Qemu it was other way around. Maintainers started caring too late.
>
> Nah, i do not think it's ever too late to care.
>
> Example: arch/i386 - arch/x86_64/ was very messy for many, many years and we
> turned it around and can be proud of arch/x86/ today - but i guess i'm somewhat
> biased there ;-)
>
> In my experience it's entirely possible to turn a messy gooball into something
> you can be proud of - it's all reversible. Start small, with the core bits you
> care about most - then extend those concepts to other areas of the code base,
> gradually. There might be subsystems that will never turn around before
> becoming obsolete - that's not a big problem.

That is what we're trying to do with qemu.

--
error compiling committee.c: too many arguments to function

2011-04-06 09:52:27

by Gleb Natapov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Wed, Apr 06, 2011 at 11:46:12AM +0200, Ingo Molnar wrote:
>
> * Gleb Natapov <[email protected]> wrote:
>
> > On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
> >
> > > So no, your kind of cynical, defeatist sentiment about code quality is by
> > > no means true in my experience. Projects become ugly gooballs once
> > > maintainers stop caring enough.
> >
> > In case of Qemu it was other way around. Maintainers started caring too late.
>
> Nah, i do not think it's ever too late to care.
>
> Example: arch/i386 - arch/x86_64/ was very messy for many, many years and we
> turned it around and can be proud of arch/x86/ today - but i guess i'm somewhat
> biased there ;-)
>
> In my experience it's entirely possible to turn a messy gooball into something
> you can be proud of - it's all reversible. Start small, with the core bits you
> care about most - then extend those concepts to other areas of the code base,
> gradually. There might be subsystems that will never turn around before
> becoming obsolete - that's not a big problem.
>
I do not disagree, but then qemu has a chance because maintainers do
care now, but not about all bits. And there should be willingness to
drop bits nobody cares about and I do not see this yet.

--
Gleb.

2011-04-06 10:14:12

by Olivier Galibert

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
> Examples: X11 and GCC - both were struggling for years to break through magic
> invisible barriers of growth and IMHO a lot of it had to do with the lack of
> code (and development model) cleanliness.

A large part of what's killing X11 and qemu is the decomposition in
multiple trees and the requirement that every version must work with
every other version.

For X11 you have:
- the server
- the protocol headers
- the individual 2D drivers
- libdrm
- the kernel
- mesa
- the video decoding driver/libs

For qemu you have:
- qemu
- qemu-kvm
- the kernel
- libvirt
- seabios

Any far-reaching change ends up hitting most of the trees, with all the
coordination that means. And in any case you're supposed to handle
any version of the other components.

Virtualbox works in part because they provide everything, from the
dhcp server to the kernel modules. The NVidia closed-source drivers
work in part because they provide everything, from the glx interface
to the kernel modules. And both of them refuse to start if everything
is not in lockstep. Meanwhile the open source idealists are a way
smaller number and expect to support every combination of everything
with everything as long as it appeared one day in a release, no matter
how buggy or badly designed it was.

So fewer people, additional hurdles that experience has shown not to be
a necessity (people cope with lockstep updates, vbox and nvidia prove
it), and one wonders why the "open" solutions end up way inferior?

OG.

2011-04-06 10:56:10

by Ingo Molnar

Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Olivier Galibert <[email protected]> wrote:

> On Wed, Apr 06, 2011 at 11:33:33AM +0200, Ingo Molnar wrote:
> > Examples: X11 and GCC - both were struggling for years to break through magic
> > invisible barriers of growth and IMHO a lot of it had to do with the lack of
> > code (and development model) cleanliness.
>
> A large part of what's killing X11 and qemu is the decomposition in
> multiple trees and the requirement that every version must work with
> every other version.
>
> For X11 you have:
> - the server
> - the protocol headers
> - the individual 2D drivers
> - libdrm
> - the kernel
> - mesa
> - the video decoding driver/libs
>
> For qemu you have:
> - qemu
> - qemu-kvm
> - the kernel
> - libvirt
> - seabios
>
> Any far-reaching change ends up hitting most of the trees, with all the
> coordination that means. And in any case you're supposed to handle
> any version of the other components.

Splitting up a project into several trees, often unnecessarily, is a
self-inflicted wound really.

Smaller projects can hurt from that as well: a well-known example is oprofile.

Pointing to the stupidity of overmodularization is one of my pet peeves, i
consider it a "development model cleanliness" bug that needlessly exposes OSS
projects to the negative effects of technical and social forks and
complicates/shackles them. I flamed^W argued about it before, in the KVM / Qemu
context as well.

There are good examples of successful, highly integrated projects:

- FreeBSD - it has achieved Linux-alike results with a fraction of the
manpower

- Android - on the desktop it has achieved much more than Linux, with a
fraction of the manpower

And that concept can be brought to its logical conclusion: i think it's only a
matter of time until someone takes the Linux kernel, integrates klibc and a
toolchain into it with some good initial userspace and goes wild with that
concept, as a single, sane, 100% self-hosting and self-sufficient OSS project,
tracking the release schedule of the Linux kernel.

It might not happen on PC hardware (which is *way* too OSS-hostile), but it
will eventually happen IMO. It's the eventual OSS killer feature and weirdly
enough no-one has tried it yet. (Android comes close in a sense)

Thanks,

Ingo

2011-04-08 02:04:15

by Anthony Liguori

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/06/2011 05:55 AM, Ingo Molnar wrote:
> Splitting up a project into several trees, often unnecessarily, is a
> self-inflicted wound really.

There's certainly something to this, but the bit that surprises me is the
approach being taken.

Why not take perf and all the other tools, stick them in their own git
repos, and use git submodules to track them in the main kernel source
tree? It seems like a nicer way to separate git histories while still
getting the benefits of a shared repository.

Regards,

Anthony Liguori

2011-04-08 02:14:10

by Anthony Liguori

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/06/2011 04:33 AM, Ingo Molnar wrote:
> * Avi Kivity<[email protected]> wrote:
>
>> Sure, any successful project becomes an ugly gooball. It's almost a
>> compliment.
> I disagree strongly with that sentiment and there's several good counter
> examples:
>
> - the Git project is also highly successful and is kept very clean (and has a
> project size comparable to Qemu)
>
> - the Linux kernel is also very clean in all areas i care about and has most
> of its ugliness stuffed into drivers/staging/ (and has a project size more
> than an order of magnitude larger than Qemu).
>
> In fact i claim the exact opposite: certain types of projects can only grow
> beyond a certain size and stay healthy if they are *not* ugly gooballs.
>
> Examples: X11 and GCC - both were struggling for years to break through magic
> invisible barriers of growth and IMHO a lot of it had to do with the lack of
> code (and development model) cleanliness.

So what makes Native Linux KVM tool so much cleaner?

As far as I can tell, it's architecturally identical to QEMU. In fact,
it's reminiscent of QEMU from about 5 years ago. It makes the same
mistakes of having a linear I/O dispatch model, makes no attempt to
enable a threaded execution model, and ignores things like migration and
manageability.

> So no, your kind of cynical, defeatist sentiment about code quality is by no
> means true in my experience. Projects become ugly gooballs once maintainers
> stop caring enough.

I think sweeping generalizations are always wrong :-)

I struggle with a lot of things in QEMU. Compatibility is just a
nightmare to maintain because so many of the previous interfaces and
functionality were so poorly thought through.

If someone was going to seriously go about doing something like this, a
better approach would be to start with QEMU and remove anything non-x86
and all of the UI/command line/management bits and start there.

There's nothing more I'd like to see than a viable alternative to QEMU
but ignoring any of the architectural mistakes in QEMU and repeating
them in a new project isn't going to get there.

Too much effort in QEMU goes into working around previous mistakes.
That doesn't mean that QEMU doesn't have a lot of useful bits in it and
hasn't figured out a lot of good ways to do things.

Regards,

Anthony Liguori

> Thanks,
>
> Ingo

2011-04-08 05:14:25

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Anthony,

On Fri, Apr 8, 2011 at 5:14 AM, Anthony Liguori <[email protected]> wrote:
> If someone was going to seriously go about doing something like this, a
> better approach would be to start with QEMU and remove anything non-x86 and
> all of the UI/command line/management bits and start there.
>
> There's nothing more I'd like to see than a viable alternative to QEMU but
> ignoring any of the architectural mistakes in QEMU and repeating them in a
> new project isn't going to get there.

Hey, feel free to help out! ;-)

I don't agree that a working 2500 LOC program is 'repeating the same
architectural mistakes' as QEMU. I hope you realize that we've gotten
here with just three part-time hackers working from their proverbial
basements. So what you call mistakes, we call features for the sake of
simplicity.

I also don't agree with this sentiment that unless we have SMP,
migration, yadda yadda yadda, now, it's impossible to change that in
the future. It ignores the fact that this is exactly how the Linux
kernel evolved and the fact that we're aggressively trying to keep the
code size as small and tidy as possible so that changing things is as
easy as possible.

I've looked at QEMU sources over the years and especially over the
past year and I think you might be way too familiar with its inner
workings to see how complex (even the core code) has become for
someone who isn't familiar with it. I think it has to do with lots of
indirection and code cleanliness issues (and I think that was the case
even before KVM came into the picture). So I don't agree at all that
taking QEMU as a starting point would make things any easier. (That
is, unless someone intimately familiar with QEMU does it.)

On Fri, Apr 8, 2011 at 5:14 AM, Anthony Liguori <[email protected]> wrote:
> Too much effort in QEMU goes into working around previous mistakes. That
> doesn't mean that QEMU doesn't have a lot of useful bits in it and hasn't
> figured out a lot of good ways to do things.

Completely agreed.

Pekka

2011-04-08 06:19:58

by Cyrill Gorcunov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Fri, Apr 8, 2011 at 9:14 AM, Pekka Enberg <[email protected]> wrote:
> Hi Anthony,
>
> On Fri, Apr 8, 2011 at 5:14 AM, Anthony Liguori <[email protected]> wrote:
>> If someone was going to seriously go about doing something like this, a
>> better approach would be to start with QEMU and remove anything non-x86 and
>> all of the UI/command line/management bits and start there.
>>
>> There's nothing more I'd like to see than a viable alternative to QEMU but
>> ignoring any of the architectural mistakes in QEMU and repeating them in a
>> new project isn't going to get there.
>
> Hey, feel free to help out! ;-)
>

Yeah, helping would be great!

2011-04-08 06:51:12

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Takuya!

On Fri, Apr 8, 2011 at 9:47 AM, Takuya Yoshikawa
<[email protected]> wrote:
> Is it possible to find the code maintenance policy on a project site
> or somewhere? -- for both short run and long run.
>
> I may get some interest in using this tool for my debugging/testing/
> self-educational purposes, but cannot know what I can do/expect.

Heh, it's all pretty straight-forward. Fetch the sources from this tree:

git clone git://github.com/penberg/linux-kvm.git

Find something interesting to hack on and when you have something you
want integrated send patches to [email protected] and CC this list.
That's it!

In the long run, we hope to live in the main kernel tree under
tools/kvm and be part of the regular kernel release cycle.

Pekka

2011-04-08 07:07:54

by Takuya Yoshikawa

Subject: Re: [ANNOUNCE] Native Linux KVM tool

> > I may get some interest in using this tool for my debugging/testing/
> > self-educational purposes, but cannot know what I can do/expect.
>
> Heh, it's all pretty straight-forward. Fetch the sources from this tree:
>
> git clone git://github.com/penberg/linux-kvm.git
>
> Find something interesting to hack on and when you have something you
> want integrated send patches to [email protected] and CC this list.
> That's it!

Thank you for your answer!
Actually, I knew how to get the code because I am checking most of emails
on this ML.

What I wanted to know is what I can do with this tool in the future.

But OK, I got it that nothing is strictly determined at the moment about
what features can/should be integrated.

Takuya

>
> In the long run, we hope to live in the main kernel tree under
> tools/kvm and be part of the regular kernel release cycle.
>
> Pekka

2011-04-08 07:10:39

by Takuya Yoshikawa

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi!

> Hi Anthony,
>
> On Fri, Apr 8, 2011 at 5:14 AM, Anthony Liguori <[email protected]> wrote:
> > If someone was going to seriously go about doing something like this, a
> > better approach would be to start with QEMU and remove anything non-x86 and
> > all of the UI/command line/management bits and start there.
> >
> > There's nothing more I'd like to see than a viable alternative to QEMU but
> > ignoring any of the architectural mistakes in QEMU and repeating them in a
> > new project isn't going to get there.
>
> Hey, feel free to help out! ;-)
>
> I don't agree that a working 2500 LOC program is 'repeating the same
> architectural mistakes' as QEMU. I hope you realize that we've gotten
> here with just three part-time hackers working from their proverbial
> basements. So what you call mistakes, we call features for the sake of
> simplicity.
>
> I also don't agree with this sentiment that unless we have SMP,
> migration, yadda yadda yadda, now, it's impossible to change that in
> the future. It ignores the fact that this is exactly how the Linux
> kernel evolved and the fact that we're aggressively trying to keep the
> code size as small and tidy as possible so that changing things is as
> easy as possible.

Is it possible to find the code maintenance policy on a project site
or somewhere? -- for both short run and long run.

I may get some interest in using this tool for my debugging/testing/
self-educational purposes, but cannot know what I can do/expect.

Takuya
For me, both QEMU and Native Linux KVM tool may be useful! :)
But, I guess, probably for different purposes.


>
> I've looked at QEMU sources over the years and especially over the
> past year and I think you might be way too familiar with its inner
> workings to see how complex (even the core code) has become for
> someone who isn't familiar with it. I think it has to do with lots of
> indirection and code cleanliness issues (and I think that was the case
> even before KVM came into the picture). So I don't agree at all that
> taking QEMU as a starting point would make things any easier. (That
> is, unless someone intimately familiar with QEMU does it.)

--
Takuya Yoshikawa <[email protected]>

2011-04-08 07:40:43

by Jan Kiszka

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 2011-04-08 07:14, Pekka Enberg wrote:
> Hi Anthony,
>
> On Fri, Apr 8, 2011 at 5:14 AM, Anthony Liguori <[email protected]> wrote:
>> If someone was going to seriously go about doing something like this, a
>> better approach would be to start with QEMU and remove anything non-x86 and
>> all of the UI/command line/management bits and start there.
>>
>> There's nothing more I'd like to see than a viable alternative to QEMU but
>> ignoring any of the architectural mistakes in QEMU and repeating them in a
>> new project isn't going to get there.
>
> Hey, feel free to help out! ;-)
>
> I don't agree that a working 2500 LOC program is 'repeating the same
> architectural mistakes' as QEMU. I hope you realize that we've gotten
> here with just three part-time hackers working from their proverbial
> basements. So what you call mistakes, we call features for the sake of
> simplicity.
>
> I also don't agree with this sentiment that unless we have SMP,
> migration, yadda yadda yadda, now, it's impossible to change that in
> the future. It ignores the fact that this is exactly how the Linux
> kernel evolved and the fact that we're aggressively trying to keep the
> code size as small and tidy as possible so that changing things is as
> easy as possible.

I agree that it's easy to change 2kSomething LOC for this. But if you
now wait too long designing in essential features like SMP, a scalable
execution model, and - very important - portability (*), it can get
fairly painful to fix such architectural deficits later on. How long did
it take for Linux to overcome the BKL? QEMU is in the same unfortunate
position.

Jan

(*) I would consider Anthony's idea to drop anything !=x86 a mistake
given where KVM is moving to, today on PPC, tomorrow likely on ARM -
just to name two examples.

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

2011-04-08 08:27:20

by Pekka Enberg

Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Jan,

On Fri, 2011-04-08 at 09:39 +0200, Jan Kiszka wrote:
> I agree that it's easy to change 2kSomething LOC for this. But if you
> now wait too long designing in essential features like SMP, a scalable
> execution model, and - very important - portability (*), it can get
> fairly painful to fix such architectural deficits later on. How long did
> it take for Linux to overcome the BKL? QEMU is in the same unfortunate
> position.

Yup, and we're taking your feedback seriously (and are thankful for
it!). We're hoping to look at SMP in the near future - help is
appreciated!

Pekka

2011-04-08 09:12:17

by Jan Kiszka

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 2011-04-08 10:27, Pekka Enberg wrote:
> Hi Jan,
>
> On Fri, 2011-04-08 at 09:39 +0200, Jan Kiszka wrote:
>> I agree that it's easy to change 2kSomething LOC for this. But if you
>> now wait too long designing in essential features like SMP, a scalable
>> execution model, and - very important - portability (*), it can get
>> fairly painful to fix such architectural deficits later on. How long did
>> it take for Linux to overcome the BKL? QEMU is in the same unfortunate
>> position.
>
> Yup, and we're taking your feedback seriously (and are thankful for
> it!). We're hoping to look at SMP in the near future - help is
> appreciated!

Honestly, I do not yet see a major advantage for us to invest here
instead of / in addition to continuing to improve QEMU. We've spent
quite some effort on the latter with IMO noteworthy results. Porting
over qemu-kvm to upstream was and still is among those efforts. We (*)
are "almost done". :)

Just one example: Despite QEMU's current deficits, I just had to add a
handful of (ad-hoc) patches to turn it into a (soft) real-time
hypervisor, and that also for certain non-Linux guests. Your approach is
still man-years of development and stabilization effort away from getting
close to such a level.

Don't want to discourage you or other contributors. I hope that this
approach can gather the critical mass and momentum to make it a real
alternative, at least for a subset of use cases. We will surely keep an
eye on it and re-assess its pros&cons as it progresses.

Jan

(*) the QEMU & KVM community

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

2011-04-08 09:32:26

by Cyrill Gorcunov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Fri, Apr 8, 2011 at 1:11 PM, Jan Kiszka <[email protected]> wrote:
> On 2011-04-08 10:27, Pekka Enberg wrote:
>> Hi Jan,
>>
>> On Fri, 2011-04-08 at 09:39 +0200, Jan Kiszka wrote:
>>> I agree that it's easy to change 2kSomething LOC for this. But if you
>>> now wait too long designing in essential features like SMP, a scalable
>>> execution model, and - very important - portability (*), it can get
>>> fairly painful to fix such architectural deficits later on. How long did
>>> it take for Linux to overcome the BKL? QEMU is in the same unfortunate
>>> position.
>>
>> Yup, and we're taking your feedback seriously (and are thankful for
>> it!). We're hoping to look at SMP in the near future - help is
>> appreciated!
>
> Honestly, I do not yet see a major advantage for us to invest here
> instead of / in addition to continuing to improve QEMU. We've spent
> quite some effort on the latter with IMO noteworthy results. Porting
> over qemu-kvm to upstream was and still is among those efforts. We (*)
> are "almost done". :)
>
> Just one example: Despite QEMU's current deficits, I just had to add a
> handful of (ad-hoc) patches to turn it into a (soft) real-time
> hypervisor, and that also for certain non-Linux guests. Your approach is
> yet man years of development and stabilization effort away from getting
> close to such a level.
>
> Don't want to discourage you or other contributors. I wish you that this
> approach can gather the critical mass and momentum to make it a real
> alternative, at least for a subset of use cases. We will surely keep an
> eye on it and re-assess its pros&cons as it progresses.
>
> Jan
>
> (*) the QEMU & KVM community
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>

It seems there is a misunderstanding. KVM-tool is quite far from being a KVM
replacement (if ever). What we're doing is an extremely tiny/small HV which
would help us to debug/test kernel code.

So I personally think of two scenarios:

1) QEMU eventually gets merged upstream and kvm-tool remains a small and
tiny example of how to do /dev/kvm ioctls with some positive (I hope)
results. Or maybe kvm-tool gets dropped since nobody needs it, this is
possible too of course.

2) kvm-tool silently sits in tools/kvm while qemu remains in a separate
repo. Both go their own ways. Not a pleasant scenario but still possible.

And we don't consider kvm-tool a qemu competitor by any means. They are
simply in different weight categories.

And of course we're glad to get any feedback (both positive and
especially negative). Pointing out that SMP might be such a problem made
us scratch our heads ;)

2011-04-08 10:43:10

by Jan Kiszka

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 2011-04-08 11:32, Cyrill Gorcunov wrote:
> It seems there is a misunderstanding. KVM-tool is quite far from been KVM
> replacement (if ever). And what we're doing -- extremely tiny/small HV which
> would help us to debug/test kernel code.

I think your core team may have this vision, but my impression is that
some people here think much further.

Also note that even for guest debugging some fairly essential features
are still missing. The gdbstub is the most prominent among them.

>
> So I personally think of two scenarios:
>
> 1) QEMU eventually get merged upstream and kvm-tool remains small and
> tiny example of how to do /dev/kvm ioctls with some positive (I hope)
> results. Or maybe kvm-tool gets dropped since nobody needs it, this is
> possible too of course.
>
> 2) kvm-tool silently sit in tools/kvm while qemu remains on separated
> repo. Both go own ways. Not a pleasant scenario but still possible.

For me the separate tree thing is not that important as long as KVM
developers continue to hack on both sides (which most of us do).

>
> And we don't consider kvm-tool as a qemu competitor by any means. It
> simply different weight categories.

Long-term, IMHO, kvm-tool either has to cover at least one use case qemu
is not interested in or it has to be noticeably better in one domain.
Just being a small demo that has to be maintained _in_addition_ to qemu
wrt. KVM ABI changes will make it suffer quickly. Right now x86 has
reached a rather calm period in this regard, but PPC e.g. is about to
enter the same stormy times we had with x86 in the past years. We've
been through a duplicate userland maintenance phase with qemu-kvm vs.
upstream qemu for far too long, and it was a real pain (it still is, but
the last duplicate bits should disappear before qemu-0.15).

>
> And of course we're glad to get any feedback (and positive and
> especially negative). Pointing out that SMP might be such a problem
> made us scratching the head ;)

One big advantage of qemu is that it can nicely reproduce tricky
concurrency issues (not all but many) as it provides true SMP support.
We've successfully used this several times for debugging weird kernel
and driver issues in the past 4 years.

So I personally have no use case for kvm-tool that qemu(-kvm) wouldn't
already solve, generally in a more advanced way. That may explain my
skepticism. :)

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

2011-04-08 12:27:30

by Alexander Graf

Subject: Re: [ANNOUNCE] Native Linux KVM tool


On 08.04.2011, at 12:42, Jan Kiszka wrote:

> On 2011-04-08 11:32, Cyrill Gorcunov wrote:
>> It seems there is a misunderstanding. KVM-tool is quite far from been KVM
>> replacement (if ever). And what we're doing -- extremely tiny/small HV which
>> would help us to debug/test kernel code.
>
> I think your core team may have this vision, but my impression is that
> some people here think much further.

I tend to agree. The core team seems to be writing this as an aid for learning the platform and getting to know KVM. I really like that approach :).

However, if it's meant to be a "toy" (and I don't mean this negatively in any way), it really should be declared as such. Calling it "kvm" for example would be a huge mistake in that case.

Either way, I like the idea of having a second user space available for x86. Even if it just means that it verifies that the documentation is correct :).


Alex

2011-04-08 12:33:17

by Cyrill Gorcunov

Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Friday, April 8, 2011, Alexander Graf <[email protected]> wrote:
>
> On 08.04.2011, at 12:42, Jan Kiszka wrote:
>
>> On 2011-04-08 11:32, Cyrill Gorcunov wrote:
>>> It seems there is a misunderstanding. KVM-tool is quite far from been KVM
>>> replacement (if ever). And what we're doing -- extremely tiny/small HV which
>>> would help us to debug/test kernel code.
>>
>> I think your core team may have this vision, but my impression is that
>> some people here think much further.
>
> I tend to agree. The core team seems to write this as a helping aid of learning the platform and getting to know KVM. I really like that approach :).
>
> However, if it's meant to be a "toy" (and I don't mean this negatively in any way), it really should be declared as such. Calling it "kvm" for example would be a huge mistake in that case.
>
> Either way, I like the idea of having a second user space available for x86. Even if it just means that it verifies that the documentation is correct :).
>
>
> Alex
>
>

If we manage to make kvm-tool mature i believe everyone will win in that
case. The announcement clearly stated kvm-tool's relation to qemu.

Of course we have great plans though :)

2011-04-08 14:00:48

by Anthony Liguori

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/08/2011 12:14 AM, Pekka Enberg wrote:
> Hey, feel free to help out! ;-)
>
> I don't agree that a working 2500 LOC program is 'repeating the same
> architectural mistakes' as QEMU. I hope you realize that we've gotten
> here with just three part-time hackers working from their proverbial
> basements. So what you call mistakes, we call features for the sake of
> simplicity.

And by all means, it's a good accomplishment.

But the mistakes I'm referring to aren't missing bits of code. It's
that the current code makes really bad assumptions.

An example is ioport_ops. This maps directly to
ioport_{read,write}_table in QEMU. Then you use ioport__register() to
register entries in this table similar to register_ioport_{read,write}() in
QEMU.

The use of a struct is a small improvement but the fundamental design is
flawed because it models a view of hardware where all devices are
directly connected to the CPU. This is not how hardware works at all.

On the PC QEMU tries to emulate, a PIO operation flows from the CPU to
the i440fx. The i440fx will do the first level of decoding treating the
PCI host controller ports specially and then posting any I/Os in the PCI
port range to the PCI bus. If no device selects these ports, or the
ports fall into the non-PCI range, the I/O request is then posted to the
PIIX3.

The PIIX3 will handle a good chunk of the I/O requests (via its Super
I/O chipset) and the remainder will be posted to the ISA bus. One or
more ISA devices may then react to these posted I/O operations.

Really, having a flat table doesn't make sense. You should just send
everything to an i440fx directly. Then the i440fx should decode what it
can, and send it to the next level, and so forth.
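
As an illustration of the flow described above, here is a minimal C
sketch of chained decoding. It is not code from QEMU or from the kvm
tool: the types pio_req and bus_node, the function pio_dispatch() and
the three node objects are hypothetical names invented for this example,
and the PCI bus step between the i440fx and the PIIX3 is left out for
brevity. Each node either claims a port or lets the request fall through
to the next level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request and node types, invented for this sketch. */
struct pio_req {
    uint16_t port;
    bool     is_write;
    uint32_t data;
};

struct bus_node {
    const char *name;
    /* Returns true if this node claims the request. */
    bool (*decode)(const struct pio_req *req);
    /* Where unclaimed requests are posted next (NULL = end of chain). */
    struct bus_node *next;
};

static bool in_range(uint16_t port, uint16_t first, uint16_t last)
{
    return port >= first && port <= last;
}

/* i440fx: claims the PCI configuration ports 0xcf8-0xcff. */
static bool i440fx_decode(const struct pio_req *req)
{
    return in_range(req->port, 0x0cf8, 0x0cff);
}

/* PIIX3: claims the master interrupt controller ports 0x20-0x21. */
static bool piix3_decode(const struct pio_req *req)
{
    return in_range(req->port, 0x0020, 0x0021);
}

/* ISA bus: claims the first serial port, 0x3f8-0x3ff. */
static bool isa_decode(const struct pio_req *req)
{
    return in_range(req->port, 0x03f8, 0x03ff);
}

/* The chain: CPU -> i440fx -> PIIX3 -> ISA bus. */
struct bus_node isa_bus = { "isa",    isa_decode,    NULL };
struct bus_node piix3   = { "piix3",  piix3_decode,  &isa_bus };
struct bus_node i440fx  = { "i440fx", i440fx_decode, &piix3 };

/* Walk the chain from the top; return the claiming node or NULL. */
struct bus_node *pio_dispatch(struct bus_node *top, const struct pio_req *req)
{
    assert(top != NULL && req != NULL);
    for (struct bus_node *n = top; n != NULL; n = n->next)
        if (n->decode(req))
            return n;
    return NULL;
}
```

A caller would always start at the top, e.g. pio_dispatch(&i440fx, &req),
so a controller that needs to alter a request has a place to do so before
the request reaches a device further down.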

You can get 90% of the way to a working device model without modelling
this type of flow, but you hit a wall pretty quickly as it's not unusual
for PCI controllers to manipulate I/O requests in some fashion
(particularly on non-x86 platforms). If you treat everything as
directly attached to the CPU, it's impossible to model this.

Likewise, the same flow is true in the opposite direction. You use
guest_flat_to_host() which assumes a linear mapping of guest memory to
host memory. We used to do that too in QEMU (phys_ram_base + X). It
took a long time to get rid of that assumption in QEMU.

There are multiple problems with this sort of assumption. The first is
that you treat all devices as being directly attached to the memory
controller. As with I/O instruction dispatch, this is not the case, and
there are many PCI controllers that will munge these accesses (think
IOMMU, for instance). The second is you assume that you're not doing
I/O to device memory, but this does happen in practice. The
cpu_physical_memory_rw() API is careful to support cases where you're
writing data to I/O memory.

The other big problem here is that if you have open access to guest
memory like this, you cannot easily track dirty information. Userspace
accesses to guest memory will not result in KVM updating the guest dirty
bitmap. You can add another API to explicitly set dirty bits (and
that's exactly what we did a few years ago) but then you'll get
extremely subtle bugs in migration if you're missing a dirty update
somewhere. This is exactly how our API evolved in QEMU.
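
To make the dirty-tracking point concrete, here is a minimal C sketch of
an accessor that device code would use instead of touching guest memory
through a raw pointer. Everything in it is an assumption made for
illustration (struct guest_ram, guest_write(), page_is_dirty(), a 4 KiB
page size and a flat host_base + offset mapping); it is not the QEMU or
kvm tool API. The point is only that the write and the dirty update
happen in one place, so a caller cannot forget the second.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12 /* assumes 4 KiB guest pages */

/* Hypothetical description of one flat guest RAM region. */
struct guest_ram {
    uint8_t  *host_base; /* start of the host mapping */
    uint64_t  size;      /* size of guest RAM in bytes */
    uint64_t *dirty;     /* one bit per guest page */
};

/* Set the dirty bit of every page touched by [gpa, gpa + len). */
static void mark_dirty(struct guest_ram *ram, uint64_t gpa, uint64_t len)
{
    uint64_t first = gpa >> PAGE_SHIFT;
    uint64_t last  = (gpa + len - 1) >> PAGE_SHIFT;

    for (uint64_t pfn = first; pfn <= last; pfn++)
        ram->dirty[pfn / 64] |= UINT64_C(1) << (pfn % 64);
}

/*
 * Copy len bytes into guest memory at guest physical address gpa and
 * record the touched pages as dirty. Returns 0 on success, -1 if the
 * range does not fit in guest RAM.
 */
int guest_write(struct guest_ram *ram, uint64_t gpa,
                const void *src, uint64_t len)
{
    assert(ram != NULL && src != NULL);
    if (len == 0)
        return 0;
    if (gpa >= ram->size || len > ram->size - gpa)
        return -1;

    memcpy(ram->host_base + gpa, src, len);
    mark_dirty(ram, gpa, len);
    return 0;
}

/* Returns 1 if the given guest page frame is marked dirty, else 0. */
int page_is_dirty(const struct guest_ram *ram, uint64_t pfn)
{
    return (int)((ram->dirty[pfn / 64] >> (pfn % 64)) & 1);
}
```

A write that straddles a page boundary marks both pages, which is the
kind of detail that is easy to miss when dirty bits are set by hand at
each call site.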

As I said earlier, there are very good reasons we do the things we do in
QEMU. We're a large code base and there's far too much of the code base
that no one cares about enough but that users are happy with. It's far
too hard to make broad sweeping changes right now (although that's
something we're trying to improve).

But I'd strongly suggest taking some of the advice being offered here.
Don't ignore the hard problems to start out with because as the code
base grows, it'll become more difficult to fix those. That's not to say
that you need to implement migration tomorrow, but at least keep the
constraints in mind and make sure that you're designing interfaces that
let you do things like keep an updated dirty bitmap when you do memory
accesses in userspace.

> I also don't agree with this sentiment that unless we have SMP,
> migration, yadda yadda yadda, now, it's impossible to change that in
> the future. It ignores the fact that this is exactly how the Linux
> kernel evolved

Over the course of 20 years. By my count, we still have another decade
of refactoring before I can get on top of my ivory tower and call every
other project terrible.

> and the fact that we're aggressively trying to keep the
> code size as small and tidy as possible so that changing things is as
> easy as possible.
>
> I've looked at QEMU sources over the years and especially over the
> past year and I think you might be way too familiar with its inner
> workings to see how complex (even the core code) has become for
> someone who isn't familiar with it.

I have no doubts about the complexity of QEMU. But the 'goo' factor is
not due to complexity, it's due to the fact that there's a lot of code
that basically needs to be removed. But removing features from an
existing project is never a popular thing to do, particularly when they
work well enough for a lot of people.

Regards,

Anthony Liguori

2011-04-08 14:39:47

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Fri, Apr 08, 2011 at 01:32:24PM +0400, Cyrill Gorcunov wrote:
>
> It seems there is a misunderstanding. KVM-tool is quite far from been KVM
> replacement (if ever). And what we're doing -- extremely tiny/small HV which
> would help us to debug/test kernel code.

If that's true, then perhaps the command-line invocation shouldn't be
named "kvm"? The collision on the name of executable that claims that
it will replace the kvm shipped in qemu seems to make the claim quite
clearly that it's going to replace qemu's kvm in short order?

- Ted

2011-04-08 16:00:39

by Scott Wood

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Thu, 7 Apr 2011 21:14:06 -0500
Anthony Liguori <[email protected]> wrote:

> If someone was going to seriously go about doing something like this, a
> better approach would be to start with QEMU and remove anything non-x86
> and all of the UI/command line/management bits and start there.
>
> There's nothing more I'd like to see than a viable alternative to QEMU
> but ignoring any of the architectural mistakes in QEMU and repeating
> them in a new project isn't going to get there.

Supporting only a single architecture sounds like a significant
architectural mistake... only x86 deserves clean code?

-Scott

2011-04-08 19:21:29

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

Hi Anthony,

On Fri, Apr 08, 2011 at 09:00:43AM -0500, Anthony Liguori wrote:
> An example is ioport_ops. This maps directly to
> ioport_{read,write}_table in QEMU. Then you use ioport__register() to
> register entries in this table similar register_ioport_{read,write}() in
> QEMU.
>
> The use of a struct is a small improvement but the fundamental design is
> flawed because it models a view of hardware where all devices are
> directly connected to the CPU. This is not how hardware works at all.

Not sure if I've the whole picture on this but I see no answer to your
email and I found your remark above the most interesting. This is
because I thought the whole point of a native kvm tool was to go all
the paravirt way to provide max performance and maybe also depend on
vhost as much as possible.

I mean if we have to care about emulating hardware _again_ and end up
replicating qemu (with the only exception of TCG) I don't see a need
for an alternative userland. Let's not underestimate how mature qemu
already is and how good it is at emulating real hardware. I thought the
whole point was exactly to avoid any complaint like "this is not how the
hardware works" and to focus only on optimizing for smp and max
scalability, ignoring how real hardware would actually work, to get
there faster than qemu can.

I had no time to read/try it yet, I'm just reading the thread here...

Thanks,
Andrea

2011-04-08 19:48:56

by gene heskett

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Friday, April 08, 2011, Scott Wood wrote:
>On Thu, 7 Apr 2011 21:14:06 -0500
>
>Anthony Liguori <[email protected]> wrote:
>> If someone was going to seriously go about doing something like this, a
>> better approach would be to start with QEMU and remove anything non-x86
>> and all of the UI/command line/management bits and start there.
>>
>> There's nothing more I'd like to see than a viable alternative to QEMU
>> but ignoring any of the architectural mistakes in QEMU and repeating
>> them in a new project isn't going to get there.
>
>Supporting only a single architecture sounds like a significant
>architectural mistake... only x86 deserves clean code?
>
>-Scott

Speaking as someone who hasn't a hand on either end of the oar in this
effort, please folks, the choice of "KVM" as a name for this project is
very poor, mainly because it has already been taken by the very commonly
used little hardware switch that allows one Keyboard, Video, Mouse kit on
the desktop to service 2 or more computers under the desk. To all the Joe
Sixpacks out there who hear that name, and go looking for the box with the
pushbuttons on it to switch computers, it is bound to be very confusing.
Surely there is a short, descriptive name for this that isn't "KVM"?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
<http://tinyurl.com/ddg5bz>
<http://www.cantrip.org/gatto.html>
"Catch a wave and you're sitting on top of the world."
- The Beach Boys

2011-04-08 22:58:17

by Anthony Liguori

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/08/2011 10:59 AM, Scott Wood wrote:
> On Thu, 7 Apr 2011 21:14:06 -0500
> Anthony Liguori<[email protected]> wrote:
>
>> If someone was going to seriously go about doing something like this, a
>> better approach would be to start with QEMU and remove anything non-x86
>> and all of the UI/command line/management bits and start there.
>>
>> There's nothing more I'd like to see than a viable alternative to QEMU
>> but ignoring any of the architectural mistakes in QEMU and repeating
>> them in a new project isn't going to get there.
> Supporting only a single architecture sounds like a significant
> architectural mistake... only x86 deserves clean code?

No, you just have to start somewhere. Since x86 is probably the
ugliest, I think it's the best place to start.

Regards,

Anthony Liguori

> -Scott
>

2011-04-08 22:59:44

by Anthony Liguori

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/08/2011 02:20 PM, Andrea Arcangeli wrote:
> Hi Anthony,
>
> On Fri, Apr 08, 2011 at 09:00:43AM -0500, Anthony Liguori wrote:
>> An example is ioport_ops. This maps directly to
>> ioport_{read,write}_table in QEMU. Then you use ioport__register() to
>> register entries in this table similar register_ioport_{read,write}() in
>> QEMU.
>>
>> The use of a struct is a small improvement but the fundamental design is
>> flawed because it models a view of hardware where all devices are
>> directly connected to the CPU. This is not how hardware works at all.
> Not sure if I've the whole picture on this but I see no answer to your
> email and I found your remark above the most interesting. This is
> because I thought the whole point of a native kvm tool was to go all
> the paravirt way to provide max performance and maybe also depend on
> vhost as much as possible.

Yeah, if that's the goal, skip all the mini-BIOS junk and just rely on a
PV kernel in the guest.

I think a mini userspace that assumes that we can change the guest
kernel and avoids having a ton of complexity to do things like CMOS
emulation would be a really interesting thing to do.

Regards,

Anthony Liguori

2011-04-09 07:40:34

by Ingo Molnar

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool


* Andrea Arcangeli <[email protected]> wrote:

> [...] I thought the whole point of a native kvm tool was to go all the
> paravirt way to provide max performance and maybe also depend on vhost as
> much as possible.

To me it's more than that: today i can use it to minimally boot test various
native bzImages just by typing:

kvm run ./bzImage

this will get me past most of the kernel init, up to the point where it would
try to mount user-space. ( That's rather powerful to me personally, as i
introduce most of my bugs to these stages of kernel bootup - and as a kernel
developer i'm not alone there ;-)

I would be sad if i were forced to compile in some sort of paravirt support,
just to be able to boot-test random native kernel images.

Really, if you check the code, serial console and timer support is not a big
deal complexity-wise and it is rather useful:

git pull git://github.com/penberg/linux-kvm master

So i think up to a point hardware emulation is both fun to implement (it's fun
to be on the receiving end of hw calls, for a change) and a no-brainer to have
from a usability POV. How far it wants to go we'll see! :-)

Thanks,

Ingo

2011-04-09 18:23:55

by Olivier Galibert

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Fri, Apr 08, 2011 at 09:00:43AM -0500, Anthony Liguori wrote:
> Really, having a flat table doesn't make sense. You should just send
> everything to an i440fx directly. Then the i440fx should decode what it
> can, and send it to the next level, and so forth.

No you shouldn't. The i440fx should merge and arbitrate the mappings
and then push *direct* links to the handling functions at the top
level. Mapping changes don't happen often on modern hardware, and
decoding is expensive. Incidentally, you can have special handling
functions which are in reality references to kernel handlers,
shortcutting userspace entirely for critical ports/mmio ranges.
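
A minimal C sketch of the alternative described above, where the top
level holds direct links to handlers and is only rewritten when a
mapping changes. The names flat_table, flat_map(), flat_read() and
demo_read() are hypothetical and exist only for this example, and the
choice to return all-ones for unclaimed ports is an assumption of the
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PORTS 65536

typedef uint32_t (*pio_read_fn)(void *opaque, uint16_t port);

/* One direct link per I/O port: handler plus its private state. */
struct port_slot {
    pio_read_fn read;
    void       *opaque;
};

static struct port_slot flat_table[NR_PORTS];

/*
 * Called by the (not shown) chipset model whenever a mapping changes.
 * Passing fn == NULL unmaps the range.
 */
void flat_map(uint16_t first, uint32_t count, pio_read_fn fn, void *opaque)
{
    assert((uint32_t)first + count <= NR_PORTS);
    for (uint32_t i = 0; i < count; i++) {
        flat_table[first + i].read   = fn;
        flat_table[first + i].opaque = opaque;
    }
}

/* Fast path: one table lookup, no decoding at access time. */
uint32_t flat_read(uint16_t port)
{
    const struct port_slot *s = &flat_table[port];

    return s->read ? s->read(s->opaque, port) : 0xffffffffu;
}

/* Trivial demo handler: returns the low byte of the port number. */
uint32_t demo_read(void *opaque, uint16_t port)
{
    (void)opaque;
    return port & 0xffu;
}
```

The cost of decoding moves from every access to every mapping change;
whether that trade is worth it is what the rest of this exchange is
about.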

OG.

2011-04-10 02:54:44

by Anthony Liguori

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/09/2011 01:23 PM, Olivier Galibert wrote:
> On Fri, Apr 08, 2011 at 09:00:43AM -0500, Anthony Liguori wrote:
>> Really, having a flat table doesn't make sense. You should just send
>> everything to an i440fx directly. Then the i440fx should decode what it
>> can, and send it to the next level, and so forth.
> No you shouldn't. The i440fx should merge and arbitrate the mappings
> and then push *direct* links to the handling functions at the top
> level. Mapping changes don't happen often on modern hardware, and
> decoding is expensive.

Decoding is not all that expensive. For non-PCI devices, the addresses
are almost always fixed so it becomes a series of conditionals and
function calls with a length of no more than 3 or 4.

For PCI devices, any downstream devices are going to fall into specific
regions that the bridge registers. Even in the pathological case of a
bus populated with 32 multi-function devices each having 6 bars, it's
still a non-overlapping list of ranges. There's nothing that prevents
you from storing a sorted version of the list such that you can binary
search to the proper dispatch device. Binary searching a list of 1500
entries is quite fast.
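
For reference, the pathological case works out to about 32 x 8 x 6 =
1536 ranges if every device has 8 functions, so a binary search needs
roughly 11 comparisons. The lookup can be sketched in C with the
standard bsearch(); struct io_range, range_lookup() and dev_id are
hypothetical names for this example, and the table is assumed to be
sorted by start address with no overlapping entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: one decoded region per BAR. */
struct io_range {
    uint64_t start;  /* first address of the range */
    uint64_t len;    /* length in bytes, must be nonzero */
    int      dev_id; /* stand-in for a pointer to the device */
};

/* bsearch() comparator: key is an address, elem is a range. */
static int cmp_addr_range(const void *key, const void *elem)
{
    uint64_t addr = *(const uint64_t *)key;
    const struct io_range *r = elem;

    if (addr < r->start)
        return -1;
    if (addr - r->start >= r->len)
        return 1;
    return 0;
}

/*
 * Find the range containing addr in a table sorted by start address.
 * Returns NULL if no range claims the address.
 */
const struct io_range *range_lookup(const struct io_range *tbl, size_t n,
                                    uint64_t addr)
{
    assert(tbl != NULL || n == 0);
    if (n == 0)
        return NULL;
    return bsearch(&addr, tbl, n, sizeof(*tbl), cmp_addr_range);
}
```

Because the ranges do not overlap, comparing an address against a range
gives a consistent ordering, which is all bsearch() requires.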

In practice, you have no more than 10-20 PCI devices with each device
having 2-3 bars. A simple linear search is not going to have a
noticeable overhead.

> Incidentally, you can have special handling
> functions which are in reality references to kernel handlers,
> shortcutting userspace entirely for critical ports/mmio ranges.

The cost here is the trip from the guest to userspace and back. If you
want to short cut in the kernel, you have to do that *before* returning
to userspace. In that case, how userspace models I/O flow doesn't matter.

The reason flow matters is that PCI controllers alter I/O. Most PCI
devices use little endian for device registers and some big endian
oriented buses will automatically do endian conversion.

Even without those types of controllers, if you use a native endian API,
an MMIO dispatch API is going to do endian conversion to the target
architecture. However, if you're expecting to return the data in little
endian (as PCI registers are expected to usually be), you need to flip
the endianness.

In QEMU, we handle this by registering bars with a function pointer
trampoline to do this. But this is with the special API. If you hook
the mapping API, you'll probably get this wrong.
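
One possible reading of the trampoline idea, as a minimal C sketch. It
is not QEMU's actual interface: struct pci_bar, pci_bar_read32(),
struct demo_dev and demo_dev_read() are hypothetical, and the sketch
assumes the dispatch layer treats handler return values as
target-native, so a little endian PCI register has to be byte-swapped
when the target is big endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*mmio_read32_fn)(void *opaque, uint64_t offset);

/* Portable 32-bit byte swap. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical BAR registration: wraps the device's own handler. */
struct pci_bar {
    mmio_read32_fn dev_read;          /* device handler, little endian regs */
    void          *dev_opaque;        /* device state for dev_read */
    int            target_big_endian; /* nonzero on a big endian target */
};

/*
 * Trampoline registered with the dispatch layer in place of dev_read.
 * It calls the device handler and fixes up the byte order, so device
 * code never has to know the target's endianness.
 */
uint32_t pci_bar_read32(void *opaque, uint64_t offset)
{
    struct pci_bar *bar = opaque;
    uint32_t v;

    assert(bar != NULL && bar->dev_read != NULL);
    v = bar->dev_read(bar->dev_opaque, offset);
    return bar->target_big_endian ? bswap32(v) : v;
}

/* Demo device: four 32-bit registers, for illustration only. */
struct demo_dev {
    uint32_t regs[4];
};

uint32_t demo_dev_read(void *opaque, uint64_t offset)
{
    struct demo_dev *dev = opaque;

    assert(offset / 4 < 4);
    return dev->regs[offset / 4];
}
```

A device registered through the wrapper gets the conversion for free; a
device that hooks the mapping directly would have to remember to do it
itself, which is the failure mode described above.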

Regards,

Anthony Liguori

> OG.

2011-04-10 08:05:56

by Avi Kivity

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On 04/09/2011 01:59 AM, Anthony Liguori wrote:
>
> Yeah, if that's the goal, skip all the mini-BIOS junk and just rely on
> a PV kernel in the guest.
>
> I think a mini userspace that assumes that we can change the guest
> kernel and avoids having a ton of complexity to do things like CMOS
> emulation would be a really interesting thing to do.
>

Changing the guest kernel is a lot more complicated than writing a full
BIOS or the legacy devices needed to support it. See Xen for an example.

--
error compiling committee.c: too many arguments to function

2011-04-12 00:59:29

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [ANNOUNCE] Native Linux KVM tool

On Sat, Apr 09, 2011 at 09:40:09AM +0200, Ingo Molnar wrote:
>
> * Andrea Arcangeli <[email protected]> wrote:
>
> > [...] I thought the whole point of a native kvm tool was to go all the
> > paravirt way to provide max performance and maybe also depend on vhost as
> > much as possible.

BTW, I should elaborate on the "all the paravirt way": going 100%
paravirt isn't what I meant. I was thinking mainly of the performance
critical drivers like storage and network. The kvm tool could
be more hackable and evolve faster by exposing a single hardware view
to the linux guest (using only paravirt whenever that improves
performance, like network/storage).

Whenever full emulation doesn't affect any fast path, it should be
preferred rather than inventing new paravirt interfaces for no
good reason.

That for example applies first and foremost to the EPT support which
is simpler and more optimal than any shadow paravirt pagetables. It'd
be a dead end performance-wise to do it all in paravirt. I definitely
didn't mean any resemblance to lguest when I said full paravirt ;).
Sorry for the confusion.

> To me it's more than that: today i can use it to minimally boot test various
> native bzImages just by typing:
>
> kvm run ./bzImage
>
> this will get me past most of the kernel init, up to the point where it would
> try to mount user-space. ( That's rather powerful to me personally, as i
> introduce most of my bugs to these stages of kernel bootup - and as a kernel
> developer i'm not alone there ;-)
>
> I would be sad if i were forced to compile in some sort of paravirt support,
> just to be able to boot-test random native kernel images.
>
> Really, if you check the code, serial console and timer support is not a big
> deal complexity-wise and it is rather useful:

Agree with that.

>
> git pull git://github.com/penberg/linux-kvm master
>
> So i think up to a point hardware emulation is both fun to implement (it's fun
> to be on the receiving end of hw calls, for a change) and a no-brainer to have
> from a usability POV. How far it wants to go we'll see! :-)

About using the kvm tool as a debugging tool, I don't see the point
though. It's very unlikely the kvm tool will ever be able to match
qemu's power and capabilities for debugging; in fact qemu will allow you
to do basic debugging of several device drivers too (e1000, IDE etc...).
I don't really see the point of the kvm tool as a debugging tool
considering how mature qemu is in terms of monitor memory inspection
commands and gdbstub. If it's debugging you're after, adding more
features to the qemu monitor looks like a better way to go.

The only way I see this being useful is to take it in a full performance
direction, using paravirt whenever it saves CPU (like virtio-blk,
vhost-net) and allowing it to scale to hundreds of cpus doing I/O
simultaneously, and to get there faster than qemu. Now smp scaling with
qemu-kvm driver backends hasn't been a big issue according to Avi, so
it's not like we're under pressure from it, but clearly someday it may
become a bigger issue and having fewer drivers to deal with (especially
only having vhost-blk in userland with vhost-net already being in the
kernel) may provide an advantage in allowing a more performance
oriented implementation of the backends without breaking lots of
existing and valuable full-emulated drivers.

In terms of pure kernel debugging I'm afraid this will be a dead end and
for the kernel testing you describe I think qemu-kvm will work best
already. We already have a simpler kvm support in qemu (vs qemu-kvm)
and we don't want a third that is even slower than qemu kvm support,
so it has to be faster than qemu-kvm or nothing IMHO :).