2007-02-16 04:47:59

by Jeremy Fitzhardinge

Subject: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Hi Andi,

This patch series implements the Linux Xen guest in terms of the
paravirt-ops interface. The features implemented in this patch series
are:
* domU only
* UP only (most code is SMP-safe, but there's no way to create a new vcpu)
* writable pagetables, with late pinning/early unpinning
(no shadow pagetable support)
* supports both PAE and non-PAE modes
* xen console
* virtual block device (blockfront)
* virtual network device (netfront)

The patch series is in two parts:

1-12: cleanups to the core kernel, either to fix outright problems,
or to add appropriate hooks for Xen
13-21: the Xen guest implementation itself

I've tried to make each patch as self-explanatory as possible. The
series is based on git changeset
ec2f9d1331f658433411c58077871e1eef4ee1b4 +
x86_64-2.6.20-git8-070213-1.patch.

Changes since the previous posting:
- rebased
- addressed review comments:
- deal with missing vga hardware better
- deal with Andi's comments
- clean up header file placement
- update netfront, and move it into drivers/net

I looked at linking in xen-head.S rather than including it into
head.S, but it seems to provoke linker bugs, so I've left it as-is
for now.

Thanks,
J

--


2007-02-16 07:00:35

by Andrew Morton

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Thu, 15 Feb 2007 18:24:49 -0800 Jeremy Fitzhardinge wrote:

> This patch series implements the Linux Xen guest in terms of the
> paravirt-ops interface.

The whole patchset exports 67 symbols to modules. How come?

Are they all needed?

2007-02-16 07:20:47

by Jeremy Fitzhardinge

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Andrew Morton wrote:
> On Thu, 15 Feb 2007 18:24:49 -0800 Jeremy Fitzhardinge wrote:
>
>
>> This patch series implements the Linux Xen guest in terms of the
>> paravirt-ops interface.
>>
>
> The whole patchset exports 67 symbols to modules. How come?
>
> Are they all needed?

Yep, pretty much. They're all generally to do with Xen's virtual device
model, and are needed by modular frontend/backend drivers. This series
only includes the basic block and network frontend devices, but there
are more waiting in the wings.

The breakdown, roughly, is:

* event channel management
* pseudophysical <-> machine addresses
* grant-table management
* xenbus, which has a filesystem-like namespace and includes
    o the means to monitor changes in objects in the namespace
    o shared resource management
    o suspend/resume



J

2007-02-16 20:49:48

by Christoph Lameter

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Thu, 15 Feb 2007, Jeremy Fitzhardinge wrote:

> This patch series implements the Linux Xen guest in terms of the
> paravirt-ops interface. The features implemented in this patch series

I am thoroughly confused. Maybe that is because I have not been following
this issue closely but it seems that you are using the paravirt interface
as an API for Xen code in the guest? I thought the idea of paravirt was to
have an API that is generic? This patchset seems to be mostly realizing
Xen specific functionality? How does the code here interact with KVM,
VMWare and other hypervisors?

2007-02-16 21:04:54

by Zachary Amsden

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> On Thu, 15 Feb 2007, Jeremy Fitzhardinge wrote:
>
>
>> This patch series implements the Linux Xen guest in terms of the
>> paravirt-ops interface. The features implemented in this patch series
>>
>
> I am thoroughly confused. Maybe that is because I have not been following
> this issue closely but it seems that you are using the paravirt interface
> as an API for Xen code in the guest? I thought the idea of paravirt was to
> have an API that is generic? This patchset seems to be mostly realizing
> Xen specific functionality? How does the code here interact with KVM,
> VMWare and other hypervisors?
>

For the most part, it doesn't disturb VMware or KVM. Xen does need some
additional functionality in paravirt-ops because they took a different
design choice - direct page tables instead of shadow page tables. This
is where all the requirements for the new Xen paravirt-ops hooks come from.

Zach

2007-02-16 21:13:45

by Christoph Lameter

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Fri, 16 Feb 2007, Zachary Amsden wrote:

> For the most part, it doesn't disturb VMware or KVM. Xen does need some
> additional functionality in paravirt-ops because they took a different design
> choice - direct page tables instead of shadow page tables. This is where all
> the requirements for the new Xen paravirt-ops hooks come from.

It still seems to be implemented for Xen and not to support a variety of
page table methods in paravirt ops.

2007-02-16 21:50:35

by Zachary Amsden

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> On Fri, 16 Feb 2007, Zachary Amsden wrote:
>
>
>> For the most part, it doesn't disturb VMware or KVM. Xen does need some
>> additional functionality in paravirt-ops because they took a different design
>> choice - direct page tables instead of shadow page tables. This is where all
>> the requirements for the new Xen paravirt-ops hooks come from.
>>
>
> It still seems to be implemented for Xen and not to support a variety of
> page table methods in paravirt ops.
>

Yes, but that is just because the Xen hooks happen to be near the last
part of the merge. VMI required some special hooks, as do both Xen and
lhype (I think ... Rusty can correct me if lhype's puppies have
precluded the addition of new hooks). Xen page table handling is very
different, mostly it is trap and emulate so writable page tables can
work, which means they don't always issue hypercalls for PTE updates,
although they do have that option, should the hypervisor MMU model
change, or performance concerns prompt a different model (or perhaps,
migration?)

Zach

2007-02-16 21:59:47

by Christoph Lameter

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Fri, 16 Feb 2007, Zachary Amsden wrote:

> Yes, but that is just because the Xen hooks happen to be near the last part
> of the merge. VMI required some special hooks, as do both Xen and lhype (I
> think ... Rusty can correct me if lhype's puppies have precluded the addition
> of new hooks). Xen page table handling is very different, mostly it is trap
> and emulate so writable page tables can work, which means they don't always
> issue hypercalls for PTE updates, although they do have that option, should
> the hypervisor MMU model change, or performance concerns prompt a different
> model (or perhaps, migration?)

Well looks like there are still some major design issues to be ironed out.
What is proposed here is to make paravirt_ops a fake generic
API and then tunnel through it to vendor specific kernel mods.

2007-02-16 22:10:50

by Zachary Amsden

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> On Fri, 16 Feb 2007, Zachary Amsden wrote:
>
>
>> Yes, but that is just because the Xen hooks happen to be near the last part
>> of the merge. VMI required some special hooks, as do both Xen and lhype (I
>> think ... Rusty can correct me if lhype's puppies have precluded the addition
>> of new hooks). Xen page table handling is very different, mostly it is trap
>> and emulate so writable page tables can work, which means they don't always
>> issue hypercalls for PTE updates, although they do have that option, should
>> the hypervisor MMU model change, or performance concerns prompt a different
>> model (or perhaps, migration?)
>>
>
> Well looks like there are still some major design issues to be ironed out.
> What is proposed here is to make paravirt_ops a fake generic
> API and then tunnel through it to vendor specific kernel mods.
>

No, there are two radically different approaches represented in one
API. Shadow page tables and direct page tables require different
abstractions to make them work. The API is not fake. It accommodates
both approaches, and the Xen changes here are pretty much required to
make direct page tables work. The shadow side of the equation is not
vendor specific, in fact, it is used by lhype to make PTE update
hypercalls. But only one vendor chose direct page tables, so it appears
vendor specific, when in fact it is just specific to that design choice.
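
To make the distinction above concrete, here is a minimal user-space sketch
in C. It is not code from the patch series: every name in it (demo_pt_ops,
demo_hypercall_update_pte, map_range) is hypothetical, and the "hypercall"
is only a counter. It assumes a single set_pte hook is enough to show how
one caller can serve both a backend that reports each PTE update to the
hypervisor and a backend that writes the page table directly.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;

/* Hypothetical hook table; the real paravirt_ops has many more entries. */
struct demo_pt_ops {
    const char *name;
    void (*set_pte)(pte_t *ptep, pte_t val);
};

/* Counts simulated hypercalls so the two backends can be compared. */
static unsigned long demo_hypercalls;

/* Stand-in for a real hypercall: here it only counts and stores. */
static void demo_hypercall_update_pte(pte_t *ptep, pte_t val)
{
    demo_hypercalls++;
    *ptep = val;
}

/* Shadow page tables: every update is reported to the hypervisor. */
static void shadow_set_pte(pte_t *ptep, pte_t val)
{
    demo_hypercall_update_pte(ptep, val);
}

/*
 * Direct (writable) page tables: the guest stores the entry itself.
 * A real hypervisor would validate the write, for example by trap and
 * emulate; that step is not modelled here.
 */
static void direct_set_pte(pte_t *ptep, pte_t val)
{
    *ptep = val;
}

static const struct demo_pt_ops demo_shadow_ops = { "shadow", shadow_set_pte };
static const struct demo_pt_ops demo_direct_ops = { "direct", direct_set_pte };

/* Generic caller: identical for both backends, it only uses the hook. */
static void map_range(const struct demo_pt_ops *ops, pte_t *table,
                      unsigned int n, pte_t first)
{
    for (unsigned int i = 0; i < n; i++)
        ops->set_pte(&table[i], first + (pte_t)i * 0x1000);
}
```

The point of the sketch is that map_range() never needs to know which design
the hypervisor chose; only the hook implementations differ.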

Adding XenBus hooks to paravirt-ops, for instance, would be vendor
specific and useless to anyone else. But that is not the approach which
has been taken here.

Zach

2007-02-16 23:49:34

by Jeremy Fitzhardinge

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> I am thoroughly confused. Maybe that is because I have not been following
> this issue closely but it seems that you are using the paravirt interface
> as an API for Xen code in the guest? I thought the idea of paravirt was to
> have an API that is generic? This patchset seems to be mostly realizing
> Xen specific functionality? How does the code here interact with KVM,
> VMWare and other hypervisors?

There are two things in this patch series: generic kernel changes, and
the core bits of Xen itself.

The earlier part of the patch contains changes to the core kernel which
are needed for Xen, but are generally harmless to non-paravirt use, or
to being virtualized under other hypervisors. In some cases these are
plain bugfixes (like dealing with absent vga hardware), or things which
have become parameterised (like the pgd alignment, or whether we share
the kernel pmd in PAE mode).

But there are also extensions to the paravirt_ops interface. The
largest of these is adding the appropriate hooks for
non-shadowed-pagetable hypervisors. While Xen is the only example of
this at the moment, it's not a Xen-specific set of hooks. It allows a
hypervisor backend to have detailed control over what actually gets put
into pagetables; in Xen's case this means we can convert the kernel's
pseudo-physical addresses into machine addresses, but you could imagine
a hypervisor maintaining some other structure in parallel with the
pagetable or something like that. An analogy would be extending the IO
DMA interfaces to account for an IOMMU, even if only one hardware
platform actually has an IOMMU.
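
As a rough illustration of that kind of hook, here is a small user-space
sketch in C. It is not taken from the patches: the hook table
(demo_mmu_hooks), the toy translation table (demo_p2m) and the 4-frame guest
are all assumptions made for the example. It shows a backend translating the
kernel's pseudo-physical frame number into a machine frame number when a
pagetable entry is built, and translating back when the entry is read, while
a native backend leaves the value untouched.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PTE_FLAGS_MASK ((UINT64_C(1) << PAGE_SHIFT) - 1)

typedef uint64_t pteval_t;

/* Hypothetical hooks: the backend decides what really goes into a PTE. */
struct demo_mmu_hooks {
    pteval_t (*make_pte)(pteval_t val); /* kernel view -> pagetable entry */
    pteval_t (*pte_val)(pteval_t pte);  /* pagetable entry -> kernel view */
};

/* Native (or shadowed) case: the kernel's value is used unchanged. */
static pteval_t native_make_pte(pteval_t val) { return val; }
static pteval_t native_pte_val(pteval_t pte) { return pte; }

/* Toy pseudo-physical -> machine frame table for a 4-frame guest. */
#define NR_FRAMES 4
static const uint64_t demo_p2m[NR_FRAMES] = { 7, 2, 9, 4 };

/* Direct pagetables: store the machine frame, keep the flag bits. */
static pteval_t direct_make_pte(pteval_t val)
{
    uint64_t pfn = val >> PAGE_SHIFT;

    assert(pfn < NR_FRAMES);
    return (demo_p2m[pfn] << PAGE_SHIFT) | (val & PTE_FLAGS_MASK);
}

/* Reverse translation by linear search; fine for a toy table. */
static pteval_t direct_pte_val(pteval_t pte)
{
    uint64_t mfn = pte >> PAGE_SHIFT;

    for (uint64_t pfn = 0; pfn < NR_FRAMES; pfn++)
        if (demo_p2m[pfn] == mfn)
            return (pfn << PAGE_SHIFT) | (pte & PTE_FLAGS_MASK);

    assert(!"machine frame not owned by this guest");
    return 0;
}

static const struct demo_mmu_hooks demo_native_hooks = {
    native_make_pte, native_pte_val
};
static const struct demo_mmu_hooks demo_direct_hooks = {
    direct_make_pte, direct_pte_val
};
```

Nothing in the hook signatures mentions Xen; another backend could keep a
different structure behind the same two functions.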

The latter part of the series is basically pure Xen-specific stuff, which
is almost entirely restricted to Xen-specific parts of the tree. This
code introduces a number of new Xen-specific interfaces, but they're
completely distinct from paravirt_ops.

J

2007-02-17 04:59:24

by Rusty Russell

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Fri, 2007-02-16 at 12:49 -0800, Christoph Lameter wrote:
> On Thu, 15 Feb 2007, Jeremy Fitzhardinge wrote:
>
> > This patch series implements the Linux Xen guest in terms of the
> > paravirt-ops interface. The features implemented in this patch series
>
> I am thoroughly confused. Maybe that is because I have not been following
> this issue closely but it seems that you are using the paravirt interface
> as an API for Xen code in the guest? I thought the idea of paravirt was to
> have an API that is generic? This patchset seems to be mostly realizing
> Xen specific functionality? How does the code here interact with KVM,
> VMWare and other hypervisors?

It doesn't. Paravirt_ops provides the hooks. KVM, lguest, VMI and Xen
all need to implement what they want on top of them.
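
A minimal user-space sketch of that split, in C, might look like the
following. It is only an illustration under stated assumptions: the table
(demo_paravirt_ops), the fake interrupt flag, the shared event mask and
demo_backend_init() are all hypothetical names invented here, not the
kernel's. The core code only ever calls through the table; a backend
replaces the entries it cares about at boot.

```c
#include <assert.h>

/* Hypothetical hook table; core code calls only through these pointers. */
struct demo_paravirt_ops {
    const char *name;
    void (*irq_disable)(void);
    void (*irq_enable)(void);
};

/* Native state: stands in for the CPU's interrupt flag. */
static int demo_native_if = 1;

static void native_irq_disable(void) { demo_native_if = 0; }
static void native_irq_enable(void)  { demo_native_if = 1; }

/* Hypervisor state: stands in for a mask shared with the hypervisor. */
static struct { int event_mask; } demo_shared;

static void backend_irq_disable(void) { demo_shared.event_mask = 1; }
static void backend_irq_enable(void)  { demo_shared.event_mask = 0; }

/* Defaults are the native implementations. */
static struct demo_paravirt_ops demo_pv_ops = {
    "native", native_irq_disable, native_irq_enable
};

/* A backend installs its own hooks early in boot (assumed entry point). */
static void demo_backend_init(void)
{
    demo_pv_ops.name = "demo-hypervisor";
    demo_pv_ops.irq_disable = backend_irq_disable;
    demo_pv_ops.irq_enable = backend_irq_enable;
}

/* Core kernel code: unaware of which backend is active. */
static void critical_section_enter(void) { demo_pv_ops.irq_disable(); }
static void critical_section_exit(void)  { demo_pv_ops.irq_enable(); }
```

After demo_backend_init() runs, the same critical_section_enter() call
touches the shared mask instead of the native flag, without any change to
the caller.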

Hope that clarifies,
Rusty.

2007-02-17 05:06:35

by Rusty Russell

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Fri, 2007-02-16 at 13:48 -0800, Zachary Amsden wrote:
> Christoph Lameter wrote:
> > It still seems to be implemented for Xen and not to support a variety of
> > page table methods in paravirt ops.
>
> Yes, but that is just because the Xen hooks happen to be near the last
> part of the merge. VMI required some special hooks, as do both Xen and
> lhype (I think ... Rusty can correct me if lhype's puppies have
> precluded the addition of new hooks).

lguest was supposed to be a demonstration of paravirt_ops, so it
shouldn't have added any. But note that I did change some other things,
such as the esp0 initialization for the swapper.

Puppies are still alive and well. Although Andi not pushing into 2.6.21
(yet?) made puppies sad 8(

> Xen page table handling is very
> different, mostly it is trap and emulate so writable page tables can
> work, which means they don't always issue hypercalls for PTE updates,
> although they do have that option, should the hypervisor MMU model
> change, or performance concerns prompt a different model (or perhaps,
> migration?)

Yes, Xen really like their direct pagetable stuff. I'm a
traditionalist, myself, but it did require some expansion of
paravirt_ops.

KVM might well want more, although from here it's more likely we'll move
some of the hooks up the stack a little IMHO.

Cheers,
Rusty.




2007-02-17 13:51:16

by Andi Kleen

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Fri, Feb 16, 2007 at 01:59:44PM -0800, Christoph Lameter wrote:
> On Fri, 16 Feb 2007, Zachary Amsden wrote:
>
> > Yes, but that is just because the Xen hooks happen to be near the last part
> > of the merge. VMI required some special hooks, as do both Xen and lhype (I
> > think ... Rusty can correct me if lhype's puppies have precluded the addition
> > of new hooks). Xen page table handling is very different, mostly it is trap
> > and emulate so writable page tables can work, which means they don't always
> > issue hypercalls for PTE updates, although they do have that option, should
> > the hypervisor MMU model change, or performance concerns prompt a different
> > model (or perhaps, migration?)
>
> Well looks like there are still some major design issues to be ironed out.
> What is proposed here is to make paravirt_ops a fake generic
> API and then tunnel through it to vendor specific kernel mods.

That was always its intention. It's not a direct interface to a hypervisor,
but a somewhat abstracted interface to a "hypervisor driver".

But you're right that there are currently still quite a lot of hooks
being added. I plan to be much more strict on that in the future.

-Andi

2007-02-21 18:37:50

by Christoph Lameter

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

On Sat, 17 Feb 2007, Andi Kleen wrote:

> That was always its intention. It's not a direct interface to a hypervisor,
> but a somewhat abstracted interface to a "hypervisor driver".

I thought that hypervisor driver was some binary blob that can be directly
accessed via paravirt_ops?

> But you're right that there are currently still quite a lot of hooks
> being added. I plan to be much more strict on that in the future.

And it seems that the hooks are not generic but bound to a particular
hypervisor. Should the Xen specific stuff not be in the binary blob?


2007-02-21 18:55:40

by Zachary Amsden

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> On Sat, 17 Feb 2007, Andi Kleen wrote:
>
>
>> That was always its intention. It's not a direct interface to a hypervisor,
>> but a somewhat abstracted interface to a "hypervisor driver".
>>
>
> I thought that hypervisor driver was some binary blob that can be directly
> accessed via paravirt_ops?
>

There are no more binary blobs being used by paravirt-ops hypervisors.
I prefer the term "open hypercode layer".


>> But you're right that there are currently still quite a lot of hooks
>> being added. I plan to be much more strict on that in the future.
>>
>
> And it seems that the hooks are not generic but bound to a particular
> hypervisor. Should the Xen specific stuff not be in the binary blob?
>

Xen doesn't use a hypercode layer, and there is no way to do what they
need to do without hooks in the kernel.

Zach

2007-02-21 20:03:09

by Jeremy Fitzhardinge

Subject: Re: [patch 00/21] Xen-paravirt: Xen guest implementation for paravirt_ops interface

Christoph Lameter wrote:
> And it seems that the hooks are not generic but bound to a particular
> hypervisor. Should the Xen specific stuff not be in the binary blob?

Xen has no "binary blob". It needs guests to cooperate with it by
making hypercalls; all that code is in the Xen implementation of the
paravirt_ops interface, which is just ordinary code which lives in the
kernel sources.

J