2013-03-03 09:18:43

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Thu, Feb 28, 2013 at 08:13:10PM +0800, Hu Tao wrote:
> This series implements a new interface, kvm pv event, to notify host when
> some events happen in guest. Right now there is one supported event: guest
> panic.
>
What other event do you have in mind? Is interface generic enough to
accommodate future, yet unknown, events. It allows to pass only one
integer specifying even type, what if additional info is needed? My be
stop pretending that device is generic and make it do once thing but do
it well? For generic even passing interface (whatever it may be needed
for) much more powerful virtio should be used.

On implementation itself I do not understand why is this kvm specific.
The only thing that makes it so is that you hook device initialization
into guest kvm initialization code, but this is obviously incorrect.
What stops QEMU tcg or Xen from reusing the same device for the same
purpose except the artificial limitation in a guest.

Reading data from a random ioports is not how you discover platform
devices in 21 century (and the data you read from unassigned port is not
guarantied to be zero, it may depend on QEMU version), you use ACPI for
that and Marcelo already pointed that to you. Having little knowledge of
ACPI (we all do) is not a good reason to not doing it. We probably need
to reserve QEMU specific ACPI Plug and Play hardware ID to define our own
devices. After that you will be able to create device with _HID(QEMU0001)
in DSDT that supplies address information (ioport to use) and capability
supported. Guest uses acpi_get_devices() to discover a platform device by
its name (QEMU0001). Then you put the driver for the platform device
into drivers/platform/x86/ and QEMU/kvm/Xen all will be able to use it.

On QEMU side of things I cannot comment much on how QOMified the device
is (it should be), I hope other reviews will verify it, but I noticed
that device is only initialized for PIIX, what about Q35?

--
Gleb.


2013-03-04 10:06:45

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

Il 03/03/2013 10:17, Gleb Natapov ha scritto:
> On Thu, Feb 28, 2013 at 08:13:10PM +0800, Hu Tao wrote:
>> This series implements a new interface, kvm pv event, to notify host when
>> some events happen in guest. Right now there is one supported event: guest
>> panic.
>>
> What other event do you have in mind? Is interface generic enough to
> accommodate future, yet unknown, events. It allows to pass only one
> integer specifying even type, what if additional info is needed? My be
> stop pretending that device is generic and make it do once thing but do
> it well? For generic even passing interface (whatever it may be needed
> for) much more powerful virtio should be used.
>
> On implementation itself I do not understand why is this kvm specific.
> The only thing that makes it so is that you hook device initialization
> into guest kvm initialization code, but this is obviously incorrect.
> What stops QEMU tcg or Xen from reusing the same device for the same
> purpose except the artificial limitation in a guest.

Agreed.

> Reading data from a random ioports is not how you discover platform
> devices in 21 century (and the data you read from unassigned port is not
> guarantied to be zero, it may depend on QEMU version), you use ACPI for
> that and Marcelo already pointed that to you. Having little knowledge of
> ACPI (we all do) is not a good reason to not doing it. We probably need
> to reserve QEMU specific ACPI Plug and Play hardware ID to define our own
> devices. After that you will be able to create device with _HID(QEMU0001)
> in DSDT that supplies address information (ioport to use) and capability
> supported.

Please also document this HID in a new file in the QEMU docs/ directory.

> Guest uses acpi_get_devices() to discover a platform device by
> its name (QEMU0001). Then you put the driver for the platform device
> into drivers/platform/x86/ and QEMU/kvm/Xen all will be able to use it.

Just to clarify it for Hu Tao, the read from a random ioport is how the
ACPI code will detect presence of the device.

Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):

Device(PEVT) {
Name(_HID, EisaId("QEMU0001"))
OperationRegion(PEOR, SystemIO, 0x505, 0x01)
Field(PEOR, ByteAcc, NoLock, Preserve) {
PEPT, 8,
}

Method(_STA, 0, NotSerialized) {
Store(PEPT, Local0)
If (LEqual(Local0, Zero)) {
Return (0x00)
} Else {
Return (0x0F)
}
}

Name(_CRS, ResourceTemplate() {
IO(Decode16, 0x505, 0x505, 0x01, 0x01)
})
}

Please test this with a QEMU option like "-M pc-1.4". The device should
_not_ be detected if you're doing it right.

> On QEMU side of things I cannot comment much on how QOMified the device
> is (it should be),

Please make the device target-independent. It can be used on non-x86
architectures that have I/O ports. You should make the port
configurable using a property (DEFINE_PROPERTY_INT16 or something like
that), with a default of 0x505.

All the panicked_action is not necessary in my opinion. We have it for
watchdogs, but that's really a legacy thing. Let libvirt do it, and
always make the guest panic perform the PANICKED_PAUSE action.

If you do it properly, a lot (really a lot) of code will go away.

> I hope other reviews will verify it, but I noticed
> that device is only initialized for PIIX, what about Q35?

Yup.

Paolo

2013-03-04 11:22:50

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Mon, Mar 04, 2013 at 11:05:37AM +0100, Paolo Bonzini wrote:
> > Guest uses acpi_get_devices() to discover a platform device by
> > its name (QEMU0001). Then you put the driver for the platform device
> > into drivers/platform/x86/ and QEMU/kvm/Xen all will be able to use it.
>
> Just to clarify it for Hu Tao, the read from a random ioport is how the
> ACPI code will detect presence of the device.
>
Actually no (at least in the long run, for the first version it may be
OK). Since we want to move DSDT generation into QEMU if device will not
be present QEMU will not generate corresponded Device() in DSDT, or it
will generate it with _STA() { Return (0x00)} hard coded. Seabios can do
the same if we will pass it info about device presence via fw_cfg. Not
sure Kevin will like it now when we plan to move DSDT into QEMU anyway :)

> Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):
>
> Device(PEVT) {
> Name(_HID, EisaId("QEMU0001"))
> OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> Field(PEOR, ByteAcc, NoLock, Preserve) {
> PEPT, 8,
> }
>
> Method(_STA, 0, NotSerialized) {
> Store(PEPT, Local0)
> If (LEqual(Local0, Zero)) {
> Return (0x00)
> } Else {
> Return (0x0F)
> }
> }
>
> Name(_CRS, ResourceTemplate() {
> IO(Decode16, 0x505, 0x505, 0x01, 0x01)
> })
> }
>
> Please test this with a QEMU option like "-M pc-1.4". The device should
> _not_ be detected if you're doing it right.
>

--
Gleb.

2013-03-06 08:46:57

by Hu Tao

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Sun, Mar 03, 2013 at 11:17:38AM +0200, Gleb Natapov wrote:
> On Thu, Feb 28, 2013 at 08:13:10PM +0800, Hu Tao wrote:
> > This series implements a new interface, kvm pv event, to notify host when
> > some events happen in guest. Right now there is one supported event: guest
> > panic.
> >
> What other event do you have in mind? Is interface generic enough to
> accommodate future, yet unknown, events. It allows to pass only one
> integer specifying even type, what if additional info is needed? My be

guest crash, lockup, or warning.[1] But the first purpose is to do panic
notification(panic event). Since at the point the guest is panicked, I
think it's better to keep the interface as simple as possible.

[1] http://wiki.qemu.org/Features/PVCrashDetection

> stop pretending that device is generic and make it do once thing but do

you mean make the interface just do panic notification?

> it well? For generic even passing interface (whatever it may be needed
> for) much more powerful virtio should be used.
>
> On implementation itself I do not understand why is this kvm specific.
> The only thing that makes it so is that you hook device initialization
> into guest kvm initialization code, but this is obviously incorrect.
> What stops QEMU tcg or Xen from reusing the same device for the same
> purpose except the artificial limitation in a guest.
>
> Reading data from a random ioports is not how you discover platform
> devices in 21 century (and the data you read from unassigned port is not
> guarantied to be zero, it may depend on QEMU version), you use ACPI for
> that and Marcelo already pointed that to you. Having little knowledge of
> ACPI (we all do) is not a good reason to not doing it. We probably need
> to reserve QEMU specific ACPI Plug and Play hardware ID to define our own

Do we have to request the ID from some orgnazation?

> devices. After that you will be able to create device with _HID(QEMU0001)

QMU0001, I think. EISA ID requires it to have only 3 letters for PNP ID.

> in DSDT that supplies address information (ioport to use) and capability
> supported. Guest uses acpi_get_devices() to discover a platform device by
> its name (QEMU0001). Then you put the driver for the platform device
> into drivers/platform/x86/ and QEMU/kvm/Xen all will be able to use it.

Thanks for the information!

>
> On QEMU side of things I cannot comment much on how QOMified the device
> is (it should be), I hope other reviews will verify it, but I noticed
> that device is only initialized for PIIX, what about Q35?
>
> --
> Gleb.

2013-03-06 08:56:27

by Hu Tao

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

Hi,

On Mon, Mar 04, 2013 at 11:05:37AM +0100, Paolo Bonzini wrote:
> Il 03/03/2013 10:17, Gleb Natapov ha scritto:
> > On Thu, Feb 28, 2013 at 08:13:10PM +0800, Hu Tao wrote:
> >> This series implements a new interface, kvm pv event, to notify host when
> >> some events happen in guest. Right now there is one supported event: guest
> >> panic.
> >>
> > What other event do you have in mind? Is interface generic enough to
> > accommodate future, yet unknown, events. It allows to pass only one
> > integer specifying even type, what if additional info is needed? My be
> > stop pretending that device is generic and make it do once thing but do
> > it well? For generic even passing interface (whatever it may be needed
> > for) much more powerful virtio should be used.
> >
> > On implementation itself I do not understand why is this kvm specific.
> > The only thing that makes it so is that you hook device initialization
> > into guest kvm initialization code, but this is obviously incorrect.
> > What stops QEMU tcg or Xen from reusing the same device for the same
> > purpose except the artificial limitation in a guest.
>
> Agreed.
>
> > Reading data from a random ioports is not how you discover platform
> > devices in 21 century (and the data you read from unassigned port is not
> > guarantied to be zero, it may depend on QEMU version), you use ACPI for
> > that and Marcelo already pointed that to you. Having little knowledge of
> > ACPI (we all do) is not a good reason to not doing it. We probably need
> > to reserve QEMU specific ACPI Plug and Play hardware ID to define our own
> > devices. After that you will be able to create device with _HID(QEMU0001)
> > in DSDT that supplies address information (ioport to use) and capability
> > supported.
>
> Please also document this HID in a new file in the QEMU docs/ directory.
>
> > Guest uses acpi_get_devices() to discover a platform device by
> > its name (QEMU0001). Then you put the driver for the platform device
> > into drivers/platform/x86/ and QEMU/kvm/Xen all will be able to use it.
>
> Just to clarify it for Hu Tao, the read from a random ioport is how the
> ACPI code will detect presence of the device.
>
> Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):
>
> Device(PEVT) {
> Name(_HID, EisaId("QEMU0001"))
> OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> Field(PEOR, ByteAcc, NoLock, Preserve) {
> PEPT, 8,
> }
>
> Method(_STA, 0, NotSerialized) {
> Store(PEPT, Local0)
> If (LEqual(Local0, Zero)) {
> Return (0x00)
> } Else {
> Return (0x0F)
> }
> }

IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
device is not present. Otherwise, the device is present. But as Gleb
said, ''the data you read from unassigned port is not guarantied to be
zero, it may depend on QEMU version''. What should I do to tell if the
device is present or not correctly?

>
> Name(_CRS, ResourceTemplate() {
> IO(Decode16, 0x505, 0x505, 0x01, 0x01)
> })
> }
>
> Please test this with a QEMU option like "-M pc-1.4". The device should
> _not_ be detected if you're doing it right.

because there is no corresponding device driver?

>
> > On QEMU side of things I cannot comment much on how QOMified the device
> > is (it should be),
>
> Please make the device target-independent. It can be used on non-x86
> architectures that have I/O ports. You should make the port
> configurable using a property (DEFINE_PROPERTY_INT16 or something like
> that), with a default of 0x505.

OK.

>
> All the panicked_action is not necessary in my opinion. We have it for
> watchdogs, but that's really a legacy thing. Let libvirt do it, and
> always make the guest panic perform the PANICKED_PAUSE action.

OK.

>
> If you do it properly, a lot (really a lot) of code will go away.

Thanks!

>
> > I hope other reviews will verify it, but I noticed
> > that device is only initialized for PIIX, what about Q35?
>
> Yup.
>
> Paolo

2013-03-06 09:07:44

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

Il 06/03/2013 09:56, Hu Tao ha scritto:
>> >
>> > Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):
>> >
>> > Device(PEVT) {
>> > Name(_HID, EisaId("QEMU0001"))
>> > OperationRegion(PEOR, SystemIO, 0x505, 0x01)
>> > Field(PEOR, ByteAcc, NoLock, Preserve) {
>> > PEPT, 8,
>> > }
>> >
>> > Method(_STA, 0, NotSerialized) {
>> > Store(PEPT, Local0)
>> > If (LEqual(Local0, Zero)) {
>> > Return (0x00)
>> > } Else {
>> > Return (0x0F)
>> > }
>> > }
> IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
> device is not present. Otherwise, the device is present. But as Gleb
> said, ''the data you read from unassigned port is not guarantied to be
> zero, it may depend on QEMU version''. What should I do to tell if the
> device is present or not correctly?

The firmware is tied to the QEMU version, so you can rely on unassigned
ports returning zero.

Later we can change this to use fw-cfg.

Paolo

2013-03-06 09:28:54

by Li Guang

[permalink] [raw]
Subject: Re: [Qemu-devel] [PATCH v13 0/8] pv event interface between host and guest

在 2013-03-06三的 10:07 +0100,Paolo Bonzini写道:
> Il 06/03/2013 09:56, Hu Tao ha scritto:
> >> >
> >> > Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):
> >> >
> >> > Device(PEVT) {
> >> > Name(_HID, EisaId("QEMU0001"))
> >> > OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> >> > Field(PEOR, ByteAcc, NoLock, Preserve) {
> >> > PEPT, 8,
> >> > }
> >> >
> >> > Method(_STA, 0, NotSerialized) {
> >> > Store(PEPT, Local0)
> >> > If (LEqual(Local0, Zero)) {
> >> > Return (0x00)
> >> > } Else {
> >> > Return (0x0F)
> >> > }
> >> > }
> > IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
> > device is not present. Otherwise, the device is present. But as Gleb
> > said, ''the data you read from unassigned port is not guarantied to be
> > zero, it may depend on QEMU version''. What should I do to tell if the
> > device is present or not correctly?
>
> The firmware is tied to the QEMU version, so you can rely on unassigned
> ports returning zero.
>

but what if we happen to transfer data end by 0x0,
here, will this device(QEMU0001) disappear?
(by this asl code snippet, I think it will)
so, if we transfer data(not 0x0) device appear,
then, it will disappear(if come across 0x0),
can this happen?
or should we use another status io port e.g.
0x506 to handle this condition?

> Later we can change this to use fw-cfg.
>
> Paolo
>
>

2013-03-06 09:38:24

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Wed, Mar 06, 2013 at 04:46:58PM +0800, Hu Tao wrote:
> On Sun, Mar 03, 2013 at 11:17:38AM +0200, Gleb Natapov wrote:
> > On Thu, Feb 28, 2013 at 08:13:10PM +0800, Hu Tao wrote:
> > > This series implements a new interface, kvm pv event, to notify host when
> > > some events happen in guest. Right now there is one supported event: guest
> > > panic.
> > >
> > What other event do you have in mind? Is interface generic enough to
> > accommodate future, yet unknown, events. It allows to pass only one
> > integer specifying even type, what if additional info is needed? My be
>
> guest crash, lockup, or warning.[1] But the first purpose is to do panic
> notification(panic event). Since at the point the guest is panicked, I
> think it's better to keep the interface as simple as possible.
>
> [1] http://wiki.qemu.org/Features/PVCrashDetection
>
> > stop pretending that device is generic and make it do once thing but do
>
> you mean make the interface just do panic notification?
>
Yes.

> > it well? For generic even passing interface (whatever it may be needed
> > for) much more powerful virtio should be used.
> >
> > On implementation itself I do not understand why is this kvm specific.
> > The only thing that makes it so is that you hook device initialization
> > into guest kvm initialization code, but this is obviously incorrect.
> > What stops QEMU tcg or Xen from reusing the same device for the same
> > purpose except the artificial limitation in a guest.
> >
> > Reading data from a random ioports is not how you discover platform
> > devices in 21 century (and the data you read from unassigned port is not
> > guarantied to be zero, it may depend on QEMU version), you use ACPI for
> > that and Marcelo already pointed that to you. Having little knowledge of
> > ACPI (we all do) is not a good reason to not doing it. We probably need
> > to reserve QEMU specific ACPI Plug and Play hardware ID to define our own
>
> Do we have to request the ID from some orgnazation?
>
A Plug and Play ID or ACPI ID can be obtained by sending e-mail to [email protected].

> > devices. After that you will be able to create device with _HID(QEMU0001)
>
> QMU0001, I think. EISA ID requires it to have only 3 letters for PNP ID.
Right, but I am not sure we need EISA ID here. Why not ACPI ID like in
the example from the spec:

Name (_HID, "MSFT0003") // Vendor-defined device

--
Gleb.

2013-03-06 09:39:50

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Wed, Mar 06, 2013 at 10:07:31AM +0100, Paolo Bonzini wrote:
> Il 06/03/2013 09:56, Hu Tao ha scritto:
> >> >
> >> > Something like this should work (in SeaBIOS's src/acpi-dsdt-isa.dsl):
> >> >
> >> > Device(PEVT) {
> >> > Name(_HID, EisaId("QEMU0001"))
> >> > OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> >> > Field(PEOR, ByteAcc, NoLock, Preserve) {
> >> > PEPT, 8,
> >> > }
> >> >
> >> > Method(_STA, 0, NotSerialized) {
> >> > Store(PEPT, Local0)
> >> > If (LEqual(Local0, Zero)) {
> >> > Return (0x00)
> >> > } Else {
> >> > Return (0x0F)
> >> > }
> >> > }
> > IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
> > device is not present. Otherwise, the device is present. But as Gleb
> > said, ''the data you read from unassigned port is not guarantied to be
> > zero, it may depend on QEMU version''. What should I do to tell if the
> > device is present or not correctly?
>
> The firmware is tied to the QEMU version, so you can rely on unassigned
> ports returning zero.
>
> Later we can change this to use fw-cfg.
>
I thought we agreed to do it from the start :)

--
Gleb.

2013-03-06 09:49:25

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest


> On Wed, Mar 06, 2013 at 10:07:31AM +0100, Paolo Bonzini wrote:
> > Il 06/03/2013 09:56, Hu Tao ha scritto:
> > >> >
> > >> > Something like this should work (in SeaBIOS's
> > >> > src/acpi-dsdt-isa.dsl):
> > >> >
> > >> > Device(PEVT) {
> > >> > Name(_HID, EisaId("QEMU0001"))
> > >> > OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> > >> > Field(PEOR, ByteAcc, NoLock, Preserve) {
> > >> > PEPT, 8,
> > >> > }
> > >> >
> > >> > Method(_STA, 0, NotSerialized) {
> > >> > Store(PEPT, Local0)
> > >> > If (LEqual(Local0, Zero)) {
> > >> > Return (0x00)
> > >> > } Else {
> > >> > Return (0x0F)
> > >> > }
> > >> > }
> > > IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
> > > device is not present. Otherwise, the device is present. But as Gleb
> > > said, ''the data you read from unassigned port is not guarantied to be
> > > zero, it may depend on QEMU version''. What should I do to tell if the
> > > device is present or not correctly?
> >
> > The firmware is tied to the QEMU version, so you can rely on
> > unassigned ports returning zero.
> >
> > Later we can change this to use fw-cfg.
>
> I thought we agreed to do it from the start :)

Then Hu will need to patch the _STA method.

Paolo

2013-03-06 10:01:50

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH v13 0/8] pv event interface between host and guest

On Wed, Mar 06, 2013 at 04:48:17AM -0500, Paolo Bonzini wrote:
>
> > On Wed, Mar 06, 2013 at 10:07:31AM +0100, Paolo Bonzini wrote:
> > > Il 06/03/2013 09:56, Hu Tao ha scritto:
> > > >> >
> > > >> > Something like this should work (in SeaBIOS's
> > > >> > src/acpi-dsdt-isa.dsl):
> > > >> >
> > > >> > Device(PEVT) {
> > > >> > Name(_HID, EisaId("QEMU0001"))
> > > >> > OperationRegion(PEOR, SystemIO, 0x505, 0x01)
> > > >> > Field(PEOR, ByteAcc, NoLock, Preserve) {
> > > >> > PEPT, 8,
> > > >> > }
> > > >> >
> > > >> > Method(_STA, 0, NotSerialized) {
> > > >> > Store(PEPT, Local0)
> > > >> > If (LEqual(Local0, Zero)) {
> > > >> > Return (0x00)
> > > >> > } Else {
> > > >> > Return (0x0F)
> > > >> > }
> > > >> > }
> > > > IIUC, here _STA reads from ioport 0x505, if the result is 0, then the
> > > > device is not present. Otherwise, the device is present. But as Gleb
> > > > said, ''the data you read from unassigned port is not guarantied to be
> > > > zero, it may depend on QEMU version''. What should I do to tell if the
> > > > device is present or not correctly?
> > >
> > > The firmware is tied to the QEMU version, so you can rely on
> > > unassigned ports returning zero.
> > >
> > > Later we can change this to use fw-cfg.
> >
> > I thought we agreed to do it from the start :)
>
> Then Hu will need to patch the _STA method.
>
_STA and _CRS.

--
Gleb.