2013-07-19 09:07:16

by Ramkumar Ramachandra

[permalink] [raw]
Subject: [QUERY] lguest64

Hi,

I tried building lguest to play with it, but was disappointed to find
this in the Kconfig:

depends on X86_32

Why is this [1]? What is so hard about supporting 64-bit machines? I
found a five-year old tree that claims to do lguest64 [2], but didn't
investigate further.

Thanks.

[1]: More importantly, who runs 32-bit kernels anymore?
[2]: http://git.et.redhat.com/?p=kernel-lguest-64.git;a=summary


2013-07-19 17:29:59

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 07/19/2013 02:06 AM, Ramkumar Ramachandra wrote:
> Hi,
>
> I tried building lguest to play with it, but was disappointed to find
> this in the Kconfig:
>
> depends on X86_32
>
> Why is this [1]? What is so hard about supporting 64-bit machines? I
> found a five-year old tree that claims to do lguest64 [2], but didn't
> investigate further.
>

Please don't have us deal with another lguest unless there is a use case
for it. We want to reduce pvops and pvops users, not increase them...

-hpa

2013-07-19 17:43:24

by Ramkumar Ramachandra

[permalink] [raw]
Subject: Re: [QUERY] lguest64

H. Peter Anvin wrote:
> We want to reduce pvops and pvops users, not increase them...

I see. So the future is true virtualization which exposes the
underlying hardware, like KVM? Why do bare-metal virtualizers like
Xen employ paravirtualization? Also, where does UML stand?

Thanks.

2013-07-19 18:47:09

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 07/19/2013 10:42 AM, Ramkumar Ramachandra wrote:
> H. Peter Anvin wrote:
>> We want to reduce pvops and pvops users, not increase them...
>
> I see. So the future is true virtualization which exposes the
> underlying hardware, like KVM? Why do bare-metal virtualizers like
> Xen employ paravirtualization? Also, where does UML stand?
>

UML, lguest and Xen were done before the x86 architecture supported
hardware virtualization. UML does paravirtualization without needing
hooks all over the kernel, which is really impressive, but unfortunately
rather slow, which makes it useful mostly for testing.

I did at some point wonder if UML would make a decent base platform for
something similar to libguestfs, but on KVM-enabled hardware KVM seems
like the better option (and is indeed what libguestfs uses.)

-hpa

2013-07-19 20:36:35

by Richard Weinberger

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Fri, Jul 19, 2013 at 8:46 PM, H. Peter Anvin <[email protected]> wrote:
> On 07/19/2013 10:42 AM, Ramkumar Ramachandra wrote:
>> H. Peter Anvin wrote:
>>> We want to reduce pvops and pvops users, not increase them...
>>
>> I see. So the future is true virtualization which exposes the
>> underlying hardware, like KVM? Why do bare-metal virtualizers like
>> Xen employ paravirtualization? Also, where does UML stand?

UML is a nice thingy because it is Linux ported to itself, but it has
limitations. Mostly it's speed (page faults and system calls are really
slow), and it supports only x86/x86_64.

I use UML on systems where KVM is not available.

--
Thanks,
//richard

2013-07-23 04:25:44

by Rusty Russell

[permalink] [raw]
Subject: Re: [QUERY] lguest64

"H. Peter Anvin" <[email protected]> writes:
> On 07/19/2013 02:06 AM, Ramkumar Ramachandra wrote:
>> Hi,
>>
>> I tried building lguest to play with it, but was disappointed to find
>> this in the Kconfig:
>>
>> depends on X86_32
>>
>> Why is this [1]? What is so hard about supporting 64-bit machines? I
>> found a five-year old tree that claims to do lguest64 [2], but didn't
>> investigate further.
>>
>
> Please don't have us deal with another lguest unless there is a use case
> for it. We want to reduce pvops and pvops users, not increase them...
>
> -hpa

Yes, the subset of x86-64 machines for which there isn't hardware
virtualization support is pretty uninteresting.

Cheers,
Rusty.

2013-07-31 09:39:26

by Mike Rapoport

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Tue, Jul 23, 2013 at 4:28 AM, Rusty Russell <[email protected]> wrote:
> "H. Peter Anvin" <[email protected]> writes:
>> On 07/19/2013 02:06 AM, Ramkumar Ramachandra wrote:
>>> Hi,
>>>
>>> I tried building lguest to play with it, but was disappointed to find
>>> this in the Kconfig:
>>>
>>> depends on X86_32
>>>
>>> Why is this [1]? What is so hard about supporting 64-bit machines? I
>>> found a five-year old tree that claims to do lguest64 [2], but didn't
>>> investigate further.
>>>

Sorry for jumping in late, but, coincidentally, I was thinking about
extending lguest to 64 bits, and while googling about the subject I
came across this thread...

>> Please don't have us deal with another lguest unless there is a use case
>> for it. We want to reduce pvops and pvops users, not increase them...
>>
>> -hpa

The use case I had in mind is to use lguest as a nested hypervisor in
public clouds. As of today, major public clouds do not support nested
virtualization and it's not clear at all if they will expose this
ability in their deployments. Addition of 64-bit support for lguest
won't require changes to pvops and, as far as I can tell, won't change
the number of pvops users...

> Yes, the subset of x86-64 machines for which there isn't hardware
> virtualization support is pretty uninteresting.

There are plenty of virtual machines in EC2, Rackspace, HP and other
clouds that do not have hardware virtualization. I believe that
running a hypervisor on them may be pretty interesting.

> Cheers,
> Rusty.


--
Sincerely yours,
Mike.

2013-07-31 12:18:09

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 07/31/2013 02:39 AM, Mike Rapoport wrote:
>
> The use case I had in mind is to use lguest as a nested hypervisor in
> public clouds. As of today, major public clouds do not support nested
> virtualization and it's not clear at all if they will expose this
> ability in their deployments. Addition of 64-bit support for lguest
> won't require changes to pvops and, as far as I can tell, won't change
> the number of pvops users...
>

"We can add a pvops user and that won't change the number of pvops
users" What?!

>> Yes, the subset of x86-64 machines for which there isn't hardware
>> virtualization support is pretty uninteresting.
>
> There are plenty virtual machines in EC2, Rackspace, HP and other
> clouds that do not have hardware virtualization. I believe that
> running a hypervisor on them may be pretty interesting.

The big problem with pvops is that they are a permanent tax on future
development -- a classic case of "the hooks problem." As such it is
important that there be a real, significant, use case with enough users
to make the pain worthwhile. With Xen looking at sunsetting PV support
with a long horizon, it might currently be possible to remove pvops some
time in the early 2020s or so timeframe. Introducing and promoting a
new user now would definitely make that impossible.

So it matters that the use case be real.

-hpa

2013-07-31 13:07:19

by Mike Rapoport

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 3:17 PM, H. Peter Anvin <[email protected]> wrote:
> On 07/31/2013 02:39 AM, Mike Rapoport wrote:
>>
>> The use case I had in mind is to use lguest as a nested hypervisor in
>> public clouds. As of today, major public clouds do not support nested
>> virtualization and it's not clear at all if they will expose this
>> ability in their deployments. Addition of 64-bit support for lguest
>> won't require changes to pvops and, as far as I can tell, won't change
>> the number of pvops users...
>>
>
> "We can add a pvops user and that won't change the number of pvops
> users" What?!

We'd be modifying an existing pvops user, IMHO. lguest is an existing
pvops user, and my idea was to extend it rather than add lguest64
alongside lguest32.

>>> Yes, the subset of x86-64 machines for which there isn't hardware
>>> virtualization support is pretty uninteresting.
>>
>> There are plenty virtual machines in EC2, Rackspace, HP and other
>> clouds that do not have hardware virtualization. I believe that
>> running a hypervisor on them may be pretty interesting.
>
> The big problem with pvops is that they are a permanent tax on future
> development -- a classic case of "the hooks problem." As such it is
> important that there be a real, significant, use case with enough users
> to make the pain worthwhile. With Xen looking at sunsetting PV support
> with a long horizon, it might currently be possible to remove pvops some
> time in the early 2020s or so timeframe. Introducing and promoting a
> new user now would definitely make that impossible.

I certainly cannot predict how many users nested virtualization in
public clouds will have before the point when public cloud providers
allow usage of hardware for that purpose. Nevertheless, I believe that
nested virtualization in public clouds is a real use case which will
have real users.

> So it matters that the use case be real.
>
> -hpa
>

--
Sincerely yours,
Mike.

2013-07-31 13:18:09

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 05:17:35AM -0700, H. Peter Anvin wrote:
> On 07/31/2013 02:39 AM, Mike Rapoport wrote:
> >
> > The use case I had in mind is to use lguest as a nested hypervisor in
> > public clouds. As of today, major public clouds do not support nested
> > virtualization and it's not clear at all if they will expose this
> > ability in their deployments. Addition of 64-bit support for lguest
> > won't require changes to pvops and, as far as I can tell, won't change
> > the number of pvops users...
> >
>
> "We can add a pvops user and that won't change the number of pvops
> users" What?!
>
> >> Yes, the subset of x86-64 machines for which there isn't hardware
> >> virtualization support is pretty uninteresting.
> >
> > There are plenty virtual machines in EC2, Rackspace, HP and other
> > clouds that do not have hardware virtualization. I believe that
> > running a hypervisor on them may be pretty interesting.
>
> The big problem with pvops is that they are a permanent tax on future
> development -- a classic case of "the hooks problem." As such it is
> important that there be a real, significant, use case with enough users
> to make the pain worthwhile. With Xen looking at sunsetting PV support
> with a long horizon, it might currently be possible to remove pvops some

PV MMU parts specifically.

> time in the early 2020s or so timeframe. Introducing and promoting a
> new user now would definitely make that impossible.
>
> So it matters that the use case be real.
>
> -hpa
>

2013-07-31 13:20:00

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 07/31/2013 06:07 AM, Mike Rapoport wrote:
>>
>> "We can add a pvops user and that won't change the number of pvops
>> users" What?!
>
> We modify existing pvops user, IMHO. lguest is existing pvops user and
> my idea was to extend it, rather than add lguest64 alongside lguest32.
>

That is nothing but creative accounting, sorry.

>>>> Yes, the subset of x86-64 machines for which there isn't hardware
>>>> virtualization support is pretty uninteresting.
>>>
>>> There are plenty virtual machines in EC2, Rackspace, HP and other
>>> clouds that do not have hardware virtualization. I believe that
>>> running a hypervisor on them may be pretty interesting.
>>
>> The big problem with pvops is that they are a permanent tax on future
>> development -- a classic case of "the hooks problem." As such it is
>> important that there be a real, significant, use case with enough users
>> to make the pain worthwhile. With Xen looking at sunsetting PV support
>> with a long horizon, it might currently be possible to remove pvops some
>> time in the early 2020s or so timeframe. Introducing and promoting a
>> new user now would definitely make that impossible.
>
> I surely cannot predict how many users there will be for nested
> virtualization in public cloud from now till the point when public
> cloud providers will allow usage of hardware for that purpose.
> Nevertheless, I believe that nested virtualization in public clouds is
> a real use case which will have real users.
>

Then that will show... however, whether or not lguest64 will be used for
that purpose is anyone's guess. I suspect personally that people will
use the already-deployed Xen PV for that purpose and it will stretch the
lifespan of that technology.

Now, nested PV is an even uglier case, and at least some public clouds
are using PV at the base layer.

-hpa

2013-07-31 13:25:40

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 07/31/2013 06:17 AM, Konrad Rzeszutek Wilk wrote:
>>
>> The big problem with pvops is that they are a permanent tax on future
>> development -- a classic case of "the hooks problem." As such it is
>> important that there be a real, significant, use case with enough users
>> to make the pain worthwhile. With Xen looking at sunsetting PV support
>> with a long horizon, it might currently be possible to remove pvops some
>
> PV MMU parts specifically.
>

Pretty much stuff that is driverized on plain hardware doesn't matter.
What are you looking at with respect to the basic CPU control state?

-hpa

2013-07-31 14:32:49

by Mike Rapoport

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 4:19 PM, H. Peter Anvin <[email protected]> wrote:
> On 07/31/2013 06:07 AM, Mike Rapoport wrote:
>>>
>>> "We can add a pvops user and that won't change the number of pvops
>>> users" What?!
>>
>> We modify existing pvops user, IMHO. lguest is existing pvops user and
>> my idea was to extend it, rather than add lguest64 alongside lguest32.
>>
>
> That is nothing but creative accounting, sorry.

If you count Xen PV 32 and 64 as two pvops users, then indeed so :)

>>>>> Yes, the subset of x86-64 machines for which there isn't hardware
>>>>> virtualization support is pretty uninteresting.
>>>>
>>>> There are plenty virtual machines in EC2, Rackspace, HP and other
>>>> clouds that do not have hardware virtualization. I believe that
>>>> running a hypervisor on them may be pretty interesting.
>>>
>>> The big problem with pvops is that they are a permanent tax on future
>>> development -- a classic case of "the hooks problem." As such it is
>>> important that there be a real, significant, use case with enough users
>>> to make the pain worthwhile. With Xen looking at sunsetting PV support
>>> with a long horizon, it might currently be possible to remove pvops some
>>> time in the early 2020s or so timeframe. Introducing and promoting a
>>> new user now would definitely make that impossible.
>>
>> I surely cannot predict how many users there will be for nested
>> virtualization in public cloud from now till the point when public
>> cloud providers will allow usage of hardware for that purpose.
>> Nevertheless, I believe that nested virtualization in public clouds is
>> a real use case which will have real users.
>
> Then that will show... however, whether or not lguest64 will be used for
> that purpose is anyone's guess. I suspect personally that people will
> use the already-deployed Xen PV for that purpose and it will stretch the
> lifespan of that technology.

Well, nesting Xen PV in a cloud VM, even a fully-virtualized one, seems
to me significantly more complicated than nesting an lguest.

> Now, nested PV is an even uglier case, and at least some public clouds
> are using PV at the base layer.

Unfortunately, the majority of public cloud VMs are PV. And, indeed,
nested PV is not nice...

> -hpa
>
>

--
Sincerely yours,
Mike.

2013-07-31 15:31:20

by Borislav Petkov

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 12:39:23PM +0300, Mike Rapoport wrote:
> There are plenty virtual machines in EC2, Rackspace, HP and other
> clouds that do not have hardware virtualization. I believe that
> running a hypervisor on them may be pretty interesting.

Interesting how?

How interesting is it really to run nested on a public
surveillance^Wcloud platform vs say, using nested kvm on your own
machine?

What are those very important use cases which warrant growing more of
that pvops gunk^Wcreativity?

:-)

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-08-01 07:18:46

by Mike Rapoport

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 6:31 PM, Borislav Petkov <[email protected]> wrote:
> On Wed, Jul 31, 2013 at 12:39:23PM +0300, Mike Rapoport wrote:
>> There are plenty virtual machines in EC2, Rackspace, HP and other
>> clouds that do not have hardware virtualization. I believe that
>> running a hypervisor on them may be pretty interesting.
>
> Interesting how?
>
> How interesting is it really to run nested on a public
> surveillance^Wcloud platform vs say, using nested kvm on your own
> machine?

There are people that use public clouds^Wsurveillance platforms. Why
they prefer these platforms over kvm on their own machines is another
question. But it is their choice and their right to use whatever
platform they'd like. I've only suggested providing them with the
ability to use nested virtualization on their platform...

> What are those very important use cases which warrant growing more of
> that pvops gunk^Wcreativity?

For instance, you could run several exact copies of an OpenStack
deployment on Amazon EC2 :-)

> :-)
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

--
Sincerely yours,
Mike.

2013-08-01 17:22:44

by Ramkumar Ramachandra

[permalink] [raw]
Subject: Re: [QUERY] lguest64

H. Peter Anvin wrote:
> UML, lguest and Xen were done before the x86 architecture supported
> hardware virtualization.

[...]

> but on KVM-enabled hardware KVM seems
> like the better option (and is indeed what libguestfs uses.)

While we're still on the topic, I'd like a few clarifications. From
your reply, I got the impression that KVM is the only mechanism for
non-pvops virtualization. This seems quite contrary to what I read on
lwn about ARM virtualization [1]. In short, ARM provides a "hypervisor
mode", and the article says

"the virtualization model provided by ARM fits the Xen
hypervisor-based virtualization better than KVM's kernel-based model"

The Xen people call this "ARM PVH" (as opposed to ARM PV, which does
not utilize hardware extensions) [2]. Although I wasn't able to find
much information about the hardware aspect, what ARM provides seems to
be quite different from VT-x and AMD-V. I'm also confused about what
virt/kvm/arm is.

Thanks.

[1]: http://lwn.net/Articles/513940/
[2]: http://www.xenproject.org/developers/teams/arm-hypervisor.html

2013-08-01 20:01:02

by Alex Elsayed

[permalink] [raw]
Subject: Re: [QUERY] lguest64

Ramkumar Ramachandra wrote:

> H. Peter Anvin wrote:
>> UML, lguest and Xen were done before the x86 architecture supported
>> hardware virtualization.
>
> [...]
>
>> but on KVM-enabled hardware KVM seems
>> like the better option (and is indeed what libguestfs uses.)
>
> While we're still on the topic, I'd like a few clarifications. From
> your reply, I got the impression that KVM the only mechanism for
> non-pvops virtualization. This seems quite contrary to what I read on
> lwn about ARM virtualization [1]. In short, ARM provides a "hypervisor
> mode", and the article says
>
> "the virtualization model provided by ARM fits the Xen
> hypervisor-based virtualization better than KVM's kernel-based model"
>
> The Xen people call this "ARM PVH" (as opposed to ARM PV, which does
> not utilize hardware extensions) [2]. Although I wasn't able to find
> much information about the hardware aspect, what ARM provides seems to
> be quite different from VT-x and AMD-V. I'm also confused about what
> virt/kvm/arm is.
>
> Thanks.
>
> [1]: http://lwn.net/Articles/513940/
> [2]: http://www.xenproject.org/developers/teams/arm-hypervisor.html

ARM's virtualization extensions may be a more *natural* match to Xen's
semantics and architecture, but that doesn't mean that KVM can't use it. LWN
explains the details far better than I can: https://lwn.net/Articles/557132/

virt/kvm/arm is an implementation of KVM (the API) that takes advantage of
ARM's virtualization extensions.

2013-08-02 02:05:12

by Rusty Russell

[permalink] [raw]
Subject: Re: [QUERY] lguest64

Mike Rapoport <[email protected]> writes:
> On Wed, Jul 31, 2013 at 3:17 PM, H. Peter Anvin <[email protected]> wrote:
>> On 07/31/2013 02:39 AM, Mike Rapoport wrote:
>>>
>>> The use case I had in mind is to use lguest as a nested hypervisor in
>>> public clouds. As of today, major public clouds do not support nested
>>> virtualization and it's not clear at all if they will expose this
>>> ability in their deployments. Addition of 64-bit support for lguest
>>> won't require changes to pvops and, as far as I can tell, won't change
>>> the number of pvops users...
>>>
>>
>> "We can add a pvops user and that won't change the number of pvops
>> users" What?!
>
> We modify existing pvops user, IMHO. lguest is existing pvops user and
> my idea was to extend it, rather than add lguest64 alongside lguest32.

Well, lguest is particularly expendable. It's the red shirt of the
virtualization away team.

Unlike HPA, I would advocate for applying the patches if you produced
them. But I'd be aware that they're likely to be ripped out as soon as
pvops has no other users.

Cheers,
Rusty.

2013-08-02 14:27:58

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

We'll look at them and consider to what degree they are likely to cause issues, but as Rusty says, it is the red shirt.

UML is also PV of course, but sits in its own corner and we x86 maintainers very rarely have to do something special to accommodate it.

Rusty Russell <[email protected]> wrote:
>Mike Rapoport <[email protected]> writes:
>> On Wed, Jul 31, 2013 at 3:17 PM, H. Peter Anvin <[email protected]>
>wrote:
>>> On 07/31/2013 02:39 AM, Mike Rapoport wrote:
>>>>
>>>> The use case I had in mind is to use lguest as a nested hypervisor
>in
>>>> public clouds. As of today, major public clouds do not support
>nested
>>>> virtualization and it's not clear at all if they will expose this
>>>> ability in their deployments. Addition of 64-bit support for lguest
>>>> won't require changes to pvops and, as far as I can tell, won't
>change
>>>> the number of pvops users...
>>>>
>>>
>>> "We can add a pvops user and that won't change the number of pvops
>>> users" What?!
>>
>> We modify existing pvops user, IMHO. lguest is existing pvops user
>and
>> my idea was to extend it, rather than add lguest64 alongside
>lguest32.
>
>Well, lguest is particularly expendable. It's the red shirt of the
>virtualization away team.
>
>Unlike HPA, I would advocate for applying the patches if you produced
>them. But I'd be aware that they're likely to be ripped out as soon as
>pvops has no other users.
>
>Cheers,
>Rusty.

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.

2013-08-02 19:09:46

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 06:25:04AM -0700, H. Peter Anvin wrote:
> On 07/31/2013 06:17 AM, Konrad Rzeszutek Wilk wrote:
> >>
> >> The big problem with pvops is that they are a permanent tax on future
> >> development -- a classic case of "the hooks problem." As such it is
> >> important that there be a real, significant, use case with enough users
> >> to make the pain worthwhile. With Xen looking at sunsetting PV support
> >> with a long horizon, it might currently be possible to remove pvops some
> >
> > PV MMU parts specifically.
> >
>
> Pretty much stuff that is driverized on plain hardware doesn't matter.
> What are you looking at with respect to the basic CPU control state?


CC-ing Mukesh here.

Let me iterate down what the experimental patch uses:

struct pv_init_ops pv_init_ops;
[still use xen_patch, but I think that is not needed anymore]

struct pv_time_ops pv_time_ops;
[we need that as we are using the PV clock source]

struct pv_cpu_ops pv_cpu_ops;
[only end up using cpuid. This one is a tricky one. We could
arguably remove it but it does do some filtering - for example
THERM is turned off, or MWAIT if a certain hypercall tells us to
disable that. Since this is now a trapped operation this could be
handled in the hypervisor - but then it would be in charge of
filtering certain CPUID - and this is at bootup - so there is not
user interaction. This needs a bit more of thinking]

struct pv_irq_ops pv_irq_ops;
[none so far, we use normal sti/cli]

struct pv_apic_ops pv_apic_ops;
[we over-write them with our own event channel logic for IPIs, etc.
Though with a virtualized APIC this could be done differently, and
some Intel engineers told me that they have it on their roadmap]

struct pv_mmu_ops pv_mmu_ops;
[we use two:
- .flush_tlb_others (xen_flush_tlb_others) - and I think we can
actually remove that. Mukesh, do you recall why we need it?
- .pagetable_init - but that can be moved out as the
only reason it does that is to use a new address (__va)
on the shared page (it swaps out of using the __kva to
using __va).]

struct pv_lock_ops pv_lock_ops;
[still using that]


Please please take this with a grain of salt. The patches are still experimental
so we might be missing something and this is not set in stone.

2013-08-04 12:37:21

by Gleb Natapov

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Fri, Aug 02, 2013 at 03:09:34PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 31, 2013 at 06:25:04AM -0700, H. Peter Anvin wrote:
> > On 07/31/2013 06:17 AM, Konrad Rzeszutek Wilk wrote:
> > >>
> > >> The big problem with pvops is that they are a permanent tax on future
> > >> development -- a classic case of "the hooks problem." As such it is
> > >> important that there be a real, significant, use case with enough users
> > >> to make the pain worthwhile. With Xen looking at sunsetting PV support
> > >> with a long horizon, it might currently be possible to remove pvops some
> > >
> > > PV MMU parts specifically.
> > >
> >
> > Pretty much stuff that is driverized on plain hardware doesn't matter.
> > What are you looking at with respect to the basic CPU control state?
>
>
> CC-ing Mukesh here.
>
> Let me iterate down what the experimental patch uses:
>
> struct pv_init_ops pv_init_ops;
> [still use xen_patch, but I think that is not needed anymore]
>
> struct pv_time_ops pv_time_ops;
> [we need that as we are using the PV clock source]
>
> struct pv_cpu_ops pv_cpu_ops;
> [only end up using cpuid. This one is a tricky one. We could
> arguable remove it but it does do some filtering - for example
> THERM is turned off, or MWAIT if a certain hypercall tells us to
> disable that. Since this is now a trapped operation this could be
> handled in the hypervisor - but then it would be in charge of
> filtering certain CPUID - and this is at bootup - so there is not
> user interaction. This needs a bit more of thinking]
>
read_msr/write_msr in this one make all MSR accesses safe. IIRC there
are MSRs that Linux uses without checking CPUID bits.
IA32_PERF_CAPABILITIES, for instance, is used without checking the PDCM bit.


--
Gleb.

2013-08-05 16:51:07

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Sun, Aug 04, 2013 at 03:37:08PM +0300, Gleb Natapov wrote:
> On Fri, Aug 02, 2013 at 03:09:34PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 31, 2013 at 06:25:04AM -0700, H. Peter Anvin wrote:
> > > On 07/31/2013 06:17 AM, Konrad Rzeszutek Wilk wrote:
> > > >>
> > > >> The big problem with pvops is that they are a permanent tax on future
> > > >> development -- a classic case of "the hooks problem." As such it is
> > > >> important that there be a real, significant, use case with enough users
> > > >> to make the pain worthwhile. With Xen looking at sunsetting PV support
> > > >> with a long horizon, it might currently be possible to remove pvops some
> > > >
> > > > PV MMU parts specifically.
> > > >
> > >
> > > Pretty much stuff that is driverized on plain hardware doesn't matter.
> > > What are you looking at with respect to the basic CPU control state?
> >
> >
> > CC-ing Mukesh here.
> >
> > Let me iterate down what the experimental patch uses:
> >
> > struct pv_init_ops pv_init_ops;
> > [still use xen_patch, but I think that is not needed anymore]
> >
> > struct pv_time_ops pv_time_ops;
> > [we need that as we are using the PV clock source]
> >
> > struct pv_cpu_ops pv_cpu_ops;
> > [only end up using cpuid. This one is a tricky one. We could
> > arguable remove it but it does do some filtering - for example
> > THERM is turned off, or MWAIT if a certain hypercall tells us to
> > disable that. Since this is now a trapped operation this could be
> > handled in the hypervisor - but then it would be in charge of
> > filtering certain CPUID - and this is at bootup - so there is not
> > user interaction. This needs a bit more of thinking]
> >
> read_msr/write_msr in this one make all msr accesses safe. IIRC there
> are MSRs that Linux uses without checking cpuid bits.
> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.

Right, those are needed as well. Completely forgot about them.
>
>
> --
> Gleb.

2013-08-05 16:59:33

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On 08/05/2013 09:50 AM, Konrad Rzeszutek Wilk wrote:
>>>
>>> Let me iterate down what the experimental patch uses:
>>>
>>> struct pv_init_ops pv_init_ops;
>>> [still use xen_patch, but I think that is not needed anymore]
>>>
>>> struct pv_time_ops pv_time_ops;
>>> [we need that as we are using the PV clock source]
>>>
>>> struct pv_cpu_ops pv_cpu_ops;
>>> [only end up using cpuid. This one is a tricky one. We could
>>> arguable remove it but it does do some filtering - for example
>>> THERM is turned off, or MWAIT if a certain hypercall tells us to
>>> disable that. Since this is now a trapped operation this could be
>>> handled in the hypervisor - but then it would be in charge of
>>> filtering certain CPUID - and this is at bootup - so there is not
>>> user interaction. This needs a bit more of thinking]
>>>
>> read_msr/write_msr in this one make all msr accesses safe. IIRC there
>> are MSRs that Linux uses without checking cpuid bits.
>> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.
>
> Right, those are needed as well. Completly forgot about them.

CPUID is not too bad. RDMSR/WRMSR is actually worse since there are
some MSRs which are performance-critical. The really messy pvops are
the memory-related ones, as they don't match the hardware behavior.

Similarly, beyond pvops, what new assumptions does this code add to the
code base?

-hpa

2013-08-05 17:17:20

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [QUERY] lguest64

> >>> struct pv_cpu_ops pv_cpu_ops;
> >>> [only end up using cpuid. This one is a tricky one. We could
> >>> arguable remove it but it does do some filtering - for example
> >>> THERM is turned off, or MWAIT if a certain hypercall tells us to
> >>> disable that. Since this is now a trapped operation this could be
> >>> handled in the hypervisor - but then it would be in charge of
> >>> filtering certain CPUID - and this is at bootup - so there is not
> >>> user interaction. This needs a bit more of thinking]
> >>>
> >> read_msr/write_msr in this one make all msr accesses safe. IIRC there
> >> are MSRs that Linux uses without checking cpuid bits.
> >> IA32_PERF_CAPABILITIES for instance is used without checking PDCM bit.
> >
> > Right, those are needed as well. Completly forgot about them.
>
> CPUID is not too bad. RDMSR/WRMSR is actually worse since there are
> some MSRs which are performance-critical. The really messy pvops are
> the memory-related ones, as they don't match the hardware behavior.

Would you by any chance have a nice test case to demonstrate the
rdmsr/wrmsr paths which are performance-critical on bare metal?
>
> Similarly, beyond pvops, what new assumptions does this code add to the
> code base?

We have not yet narrowed down how to "negotiate" the GDT values, as
the VMX code in the hypervisor has set those up before it loads the kernel.
I think Mukesh was thinking of extending the .Xen.note to enumerate some of
the ones that are needed, so that the hypervisor can slurp them in.

2013-08-08 19:15:30

by Richard W.M. Jones

[permalink] [raw]
Subject: Re: [QUERY] lguest64

On Wed, Jul 31, 2013 at 12:39:23PM +0300, Mike Rapoport wrote:
> On Tue, Jul 23, 2013 at 4:28 AM, Rusty Russell <[email protected]> wrote:
> > Yes, the subset of x86-64 machines for which there isn't hardware
> > virtualization support is pretty uninteresting.
>
> There are plenty virtual machines in EC2, Rackspace, HP and other
> clouds that do not have hardware virtualization. I believe that
> running a hypervisor on them may be pretty interesting.

[Jumping in rather late]

The problem with basing this on lguest is that you would need to
implement a whole lot of stuff from qemu to make lguest really useful
as a modern hypervisor, e.g. qcow2 and a variety of other block
devices, kvmclock, virtio{-scsi,-net}. Probably more, but just
implementing those will keep you going for a while. It might also be
feasible to add lguest support to qemu.

However, I think it's best to do nothing and use TCG mode in qemu. TCG
is a bit slower than lguest or UML, but definitely not unusable. It's
a drop-in replacement for qemu/KVM with all the same features, and it
works today.

We use and support TCG to make libguestfs work on EC2.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW