2012-05-06 15:21:07

by Xudong Hao

[permalink] [raw]
Subject: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

Enable device LTR/OBFF capibility before do device assignment, so that guest can benefit from them.

Signed-off-by: Xudong Hao <[email protected]>
---
virt/kvm/iommu.c | 32 ++++++++++++++++++++++++++++++++
1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/iommu.c b/virt/kvm/iommu.c
index e9fff98..5139f2d 100644
--- a/virt/kvm/iommu.c
+++ b/virt/kvm/iommu.c
@@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
if (pdev == NULL)
return -ENODEV;

+ /* Enable some device capibility before do device assignment,
+ * so that guest can benefit from them.
+ */
+ kvm_iommu_enable_dev_caps(pdev);
r = iommu_attach_device(domain, &pdev->dev);
if (r) {
printk(KERN_ERR "assign device %x:%x:%x.%x failed",
@@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
PCI_SLOT(assigned_dev->host_devfn),
PCI_FUNC(assigned_dev->host_devfn));

+ kvm_iommu_disable_dev_caps(pdev);
return 0;
}

@@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
iommu_domain_free(domain);
return 0;
}
+
+static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
+{
+ /* set default value */
+ unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
+ int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
+
+ /* LTR(Latency tolerance reporting) allows devices to send
+ * messages to the root complex indicating their latency
+ * tolerance for snooped & unsnooped memory transactions.
+ */
+ pci_enable_ltr(pdev);
+ pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
+
+ /* OBFF (optimized buffer flush/fill), where supported,
+ * can help improve energy efficiency by giving devices
+ * information about when interrupts and other activity
+ * will have a reduced power impact.
+ */
+ pci_enable_obff(pdev, type);
+}
+
+static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
+{
+ pci_disble_obff(pdev);
+ pci_disble_ltr(pdev);
+}
--
1.6.0.rc1


2012-05-06 15:34:27

by Avi Kivity

[permalink] [raw]
Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

On 05/06/2012 06:24 PM, Xudong Hao wrote:
> Enable device LTR/OBFF capibility before do device assignment, so that guest can benefit from them.

cc += Alex

> @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> if (pdev == NULL)
> return -ENODEV;
>
> + /* Enable some device capibility before do device assignment,
> + * so that guest can benefit from them.
> + */
> + kvm_iommu_enable_dev_caps(pdev);
> r = iommu_attach_device(domain, &pdev->dev);

Suppose we fail here. Do we need to disable_dev_caps()?

> if (r) {
> printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> PCI_SLOT(assigned_dev->host_devfn),
> PCI_FUNC(assigned_dev->host_devfn));
>
> + kvm_iommu_disable_dev_caps(pdev);
> return 0;
> }
>
> @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
> iommu_domain_free(domain);
> return 0;
> }
> +
> +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> +{
> + /* set default value */
> + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;

Where does this magic number come from?

> +
> + /* LTR(Latency tolerance reporting) allows devices to send
> + * messages to the root complex indicating their latency
> + * tolerance for snooped & unsnooped memory transactions.
> + */
> + pci_enable_ltr(pdev);
> + pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
> +
> + /* OBFF (optimized buffer flush/fill), where supported,
> + * can help improve energy efficiency by giving devices
> + * information about when interrupts and other activity
> + * will have a reduced power impact.
> + */
> + pci_enable_obff(pdev, type);
> +}
> +
> +static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
> +{
> + pci_disble_obff(pdev);
> + pci_disble_ltr(pdev);
> +}

Do we need to communicate something about these capabilities to the guest?

--
error compiling committee.c: too many arguments to function

2012-05-07 07:58:16

by Hao, Xudong

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

> -----Original Message-----
> From: Avi Kivity [mailto:[email protected]]
> Sent: Sunday, May 06, 2012 11:34 PM
> To: Xudong Hao
> Cc: [email protected]; [email protected]; [email protected];
> Zhang, Xiantao; Hao, Xudong; Alex Williamson
> Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> device assignment
>
> On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > Enable device LTR/OBFF capibility before do device assignment, so that guest
> can benefit from them.
>
> cc += Alex
>
> > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > if (pdev == NULL)
> > return -ENODEV;
> >
> > + /* Enable some device capibility before do device assignment,
> > + * so that guest can benefit from them.
> > + */
> > + kvm_iommu_enable_dev_caps(pdev);
> > r = iommu_attach_device(domain, &pdev->dev);
>
> Suppose we fail here. Do we need to disable_dev_caps()?
>
I don't think so. When a device will be assigned to guest, it's be owned by a pci-stub driver, attach_device fail here do not affect everything. If host want to use it, host device driver has its own enable/disable dev_caps.

> > if (r) {
> > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > PCI_SLOT(assigned_dev->host_devfn),
> > PCI_FUNC(assigned_dev->host_devfn));
> >
> > + kvm_iommu_disable_dev_caps(pdev);
> > return 0;
> > }
> >
> > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
> > iommu_domain_free(domain);
> > return 0;
> > }
> > +
> > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > +{
> > + /* set default value */
> > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
>
> Where does this magic number come from?
>
The number is the max value that register support, set it as default here, we did not have any device here, and we do not know what's the proper value, so it set a default value firstly.


> > +
> > + /* LTR(Latency tolerance reporting) allows devices to send
> > + * messages to the root complex indicating their latency
> > + * tolerance for snooped & unsnooped memory transactions.
> > + */
> > + pci_enable_ltr(pdev);
> > + pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
> > +
> > + /* OBFF (optimized buffer flush/fill), where supported,
> > + * can help improve energy efficiency by giving devices
> > + * information about when interrupts and other activity
> > + * will have a reduced power impact.
> > + */
> > + pci_enable_obff(pdev, type);
> > +}
> > +
> > +static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
> > +{
> > + pci_disble_obff(pdev);
> > + pci_disble_ltr(pdev);
> > +}
>
> Do we need to communicate something about these capabilities to the guest?
>

I guess you means that here host don't know if guest want to enable them, right?
The ltr/obff new feature are supposed to enabled by guest if platform and device supported.

> --
> error compiling committee.c: too many arguments to function

2012-05-07 16:16:29

by Alex Williamson

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > -----Original Message-----
> > From: Avi Kivity [mailto:[email protected]]
> > Sent: Sunday, May 06, 2012 11:34 PM
> > To: Xudong Hao
> > Cc: [email protected]; [email protected]; [email protected];
> > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> > device assignment
> >
> > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > Enable device LTR/OBFF capibility before do device assignment, so that guest
> > can benefit from them.
> >
> > cc += Alex
> >
> > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > if (pdev == NULL)
> > > return -ENODEV;
> > >
> > > + /* Enable some device capibility before do device assignment,
> > > + * so that guest can benefit from them.
> > > + */
> > > + kvm_iommu_enable_dev_caps(pdev);
> > > r = iommu_attach_device(domain, &pdev->dev);
> >
> > Suppose we fail here. Do we need to disable_dev_caps()?
> >

If kvm_assign_device() fails we'll try to restore the state we saved in
kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
initial state.

> I don't think so. When a device will be assigned to guest, it's be
> owned by a pci-stub driver, attach_device fail here do not affect
> everything. If host want to use it, host device driver has its own
> enable/disable dev_caps.

Why is device assignment unique here? If there's a default value that's
known to be safe, why doesn't pci_enable_device set it for everyone?
Host drivers can fine tune the value later if they want.

> > > if (r) {
> > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > > PCI_SLOT(assigned_dev->host_devfn),
> > > PCI_FUNC(assigned_dev->host_devfn));
> > >
> > > + kvm_iommu_disable_dev_caps(pdev);
> > > return 0;
> > > }
> > >
> > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
> > > iommu_domain_free(domain);
> > > return 0;
> > > }
> > > +
> > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > +{
> > > + /* set default value */
> > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> >
> > Where does this magic number come from?
> >
> The number is the max value that register support, set it as default
> here, we did not have any device here, and we do not know what's the
> proper value, so it set a default value firstly.

The register is composed of latency scale and latency value fields.
1024 is simply the largest value the latency value can hold (+1). The
scale field allows latencies up to 34,326,183,936ns to be specified, so
please explain how 1024 is a universal default.

> > > +
> > > + /* LTR(Latency tolerance reporting) allows devices to send
> > > + * messages to the root complex indicating their latency
> > > + * tolerance for snooped & unsnooped memory transactions.
> > > + */
> > > + pci_enable_ltr(pdev);
> > > + pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
> > > +
> > > + /* OBFF (optimized buffer flush/fill), where supported,
> > > + * can help improve energy efficiency by giving devices
> > > + * information about when interrupts and other activity
> > > + * will have a reduced power impact.
> > > + */
> > > + pci_enable_obff(pdev, type);
> > > +}
> > > +
> > > +static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
> > > +{
> > > + pci_disble_obff(pdev);
> > > + pci_disble_ltr(pdev);
> > > +}
> >
> > Do we need to communicate something about these capabilities to the guest?
> >
>
> I guess you means that here host don't know if guest want to enable them, right?
> The ltr/obff new feature are supposed to enabled by guest if platform and device supported.

It looks like ltr is a two part mechanism, the capability and enable
lives in the pci express capability, but the tuning registers live in
extended capability space. The guest doesn't yet have access to the
latter since we don't have an express chipset. The capability and
enable are read-only to the guest currently, same for obff. Thanks,

Alex

2012-05-08 09:16:40

by Hao, Xudong

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

> -----Original Message-----
> From: Alex Williamson [mailto:[email protected]]
> Sent: Tuesday, May 08, 2012 12:16 AM
> To: Hao, Xudong
> Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> [email protected]; Zhang, Xiantao
> Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> device assignment
>
> On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > > -----Original Message-----
> > > From: Avi Kivity [mailto:[email protected]]
> > > Sent: Sunday, May 06, 2012 11:34 PM
> > > To: Xudong Hao
> > > Cc: [email protected]; [email protected];
> [email protected];
> > > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> guest
> > > device assignment
> > >
> > > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > > Enable device LTR/OBFF capibility before do device assignment, so that
> guest
> > > can benefit from them.
> > >
> > > cc += Alex
> > >
> > > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > > if (pdev == NULL)
> > > > return -ENODEV;
> > > >
> > > > + /* Enable some device capibility before do device assignment,
> > > > + * so that guest can benefit from them.
> > > > + */
> > > > + kvm_iommu_enable_dev_caps(pdev);
> > > > r = iommu_attach_device(domain, &pdev->dev);
> > >
> > > Suppose we fail here. Do we need to disable_dev_caps()?
> > >
>
> If kvm_assign_device() fails we'll try to restore the state we saved in
> kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
> initial state.
>
Right, more clear.

> > I don't think so. When a device will be assigned to guest, it's be
> > owned by a pci-stub driver, attach_device fail here do not affect
> > everything. If host want to use it, host device driver has its own
> > enable/disable dev_caps.
>
> Why is device assignment unique here? If there's a default value that's
> known to be safe, why doesn't pci_enable_device set it for everyone?
> Host drivers can fine tune the value later if they want.
>
> > > > if (r) {
> > > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > > > PCI_SLOT(assigned_dev->host_devfn),
> > > > PCI_FUNC(assigned_dev->host_devfn));
> > > >
> > > > + kvm_iommu_disable_dev_caps(pdev);
> > > > return 0;
> > > > }
> > > >
> > > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
> > > > iommu_domain_free(domain);
> > > > return 0;
> > > > }
> > > > +
> > > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > > +{
> > > > + /* set default value */
> > > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> > >
> > > Where does this magic number come from?
> > >
> > The number is the max value that register support, set it as default
> > here, we did not have any device here, and we do not know what's the
> > proper value, so it set a default value firstly.
>
> The register is composed of latency scale and latency value fields.
> 1024 is simply the largest value the latency value can hold (+1). The
> scale field allows latencies up to 34,326,183,936ns to be specified, so
> please explain how 1024 is a universal default.
>

Since each platform will have its own max supported latency, I think the best way is setting the value to 0 because we have such a device now.

> > > > +
> > > > + /* LTR(Latency tolerance reporting) allows devices to send
> > > > + * messages to the root complex indicating their latency
> > > > + * tolerance for snooped & unsnooped memory transactions.
> > > > + */
> > > > + pci_enable_ltr(pdev);
> > > > + pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
> > > > +
> > > > + /* OBFF (optimized buffer flush/fill), where supported,
> > > > + * can help improve energy efficiency by giving devices
> > > > + * information about when interrupts and other activity
> > > > + * will have a reduced power impact.
> > > > + */
> > > > + pci_enable_obff(pdev, type);
> > > > +}
> > > > +
> > > > +static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
> > > > +{
> > > > + pci_disble_obff(pdev);
> > > > + pci_disble_ltr(pdev);
> > > > +}
> > >
> > > Do we need to communicate something about these capabilities to the
> guest?
> > >
> >
> > I guess you means that here host don't know if guest want to enable them,
> right?
> > The ltr/obff new feature are supposed to enabled by guest if platform and
> device supported.
>
> It looks like ltr is a two part mechanism, the capability and enable
> lives in the pci express capability, but the tuning registers live in
> extended capability space. The guest doesn't yet have access to the
> latter since we don't have an express chipset. The capability and
> enable are read-only to the guest currently, same for obff. Thanks,
>
> Alex

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2012-05-08 15:18:26

by Alex Williamson

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

On Tue, 2012-05-08 at 09:16 +0000, Hao, Xudong wrote:
> > -----Original Message-----
> > From: Alex Williamson [mailto:[email protected]]
> > Sent: Tuesday, May 08, 2012 12:16 AM
> > To: Hao, Xudong
> > Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> > [email protected]; Zhang, Xiantao
> > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> > device assignment
> >
> > On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > > > -----Original Message-----
> > > > From: Avi Kivity [mailto:[email protected]]
> > > > Sent: Sunday, May 06, 2012 11:34 PM
> > > > To: Xudong Hao
> > > > Cc: [email protected]; [email protected];
> > [email protected];
> > > > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > > > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> > guest
> > > > device assignment
> > > >
> > > > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > > > Enable device LTR/OBFF capibility before do device assignment, so that
> > guest
> > > > can benefit from them.
> > > >
> > > > cc += Alex
> > > >
> > > > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > > > if (pdev == NULL)
> > > > > return -ENODEV;
> > > > >
> > > > > + /* Enable some device capibility before do device assignment,
> > > > > + * so that guest can benefit from them.
> > > > > + */
> > > > > + kvm_iommu_enable_dev_caps(pdev);
> > > > > r = iommu_attach_device(domain, &pdev->dev);
> > > >
> > > > Suppose we fail here. Do we need to disable_dev_caps()?
> > > >
> >
> > If kvm_assign_device() fails we'll try to restore the state we saved in
> > kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
> > initial state.
> >
> Right, more clear.
>
> > > I don't think so. When a device will be assigned to guest, it's be
> > > owned by a pci-stub driver, attach_device fail here do not affect
> > > everything. If host want to use it, host device driver has its own
> > > enable/disable dev_caps.
> >
> > Why is device assignment unique here? If there's a default value that's
> > known to be safe, why doesn't pci_enable_device set it for everyone?
> > Host drivers can fine tune the value later if they want.
> >
> > > > > if (r) {
> > > > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > > > > PCI_SLOT(assigned_dev->host_devfn),
> > > > > PCI_FUNC(assigned_dev->host_devfn));
> > > > >
> > > > > + kvm_iommu_disable_dev_caps(pdev);
> > > > > return 0;
> > > > > }
> > > > >
> > > > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm *kvm)
> > > > > iommu_domain_free(domain);
> > > > > return 0;
> > > > > }
> > > > > +
> > > > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > > > +{
> > > > > + /* set default value */
> > > > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> > > >
> > > > Where does this magic number come from?
> > > >
> > > The number is the max value that register support, set it as default
> > > here, we did not have any device here, and we do not know what's the
> > > proper value, so it set a default value firstly.
> >
> > The register is composed of latency scale and latency value fields.
> > 1024 is simply the largest value the latency value can hold (+1). The
> > scale field allows latencies up to 34,326,183,936ns to be specified, so
> > please explain how 1024 is a universal default.
> >
>
> Since each platform will have its own max supported latency, I think
> the best way is setting the value to 0 because we have such a device
> now.

What's the benefit to that device vs the risk to other devices? Again,
if there's a safe default value for both LTR and OBFF, why isn't PCI
core setting it for everyone? I'm inclined to wait for qemu express
support and expose LTR/OBFF control to the guest if and only if we can
enable it on the root complex and intermediate switches. Thanks,

Alex

> > > > > +
> > > > > + /* LTR(Latency tolerance reporting) allows devices to send
> > > > > + * messages to the root complex indicating their latency
> > > > > + * tolerance for snooped & unsnooped memory transactions.
> > > > > + */
> > > > > + pci_enable_ltr(pdev);
> > > > > + pci_set_ltr(pdev, snoop_lat_ns, nosnoop_lat_ns);
> > > > > +
> > > > > + /* OBFF (optimized buffer flush/fill), where supported,
> > > > > + * can help improve energy efficiency by giving devices
> > > > > + * information about when interrupts and other activity
> > > > > + * will have a reduced power impact.
> > > > > + */
> > > > > + pci_enable_obff(pdev, type);
> > > > > +}
> > > > > +
> > > > > +static void kvm_iommu_disable_dev_caps(struct pci_dev *pdev)
> > > > > +{
> > > > > + pci_disble_obff(pdev);
> > > > > + pci_disble_ltr(pdev);
> > > > > +}
> > > >
> > > > Do we need to communicate something about these capabilities to the
> > guest?
> > > >
> > >
> > > I guess you means that here host don't know if guest want to enable them,
> > right?
> > > The ltr/obff new feature are supposed to enabled by guest if platform and
> > device supported.
> >
> > It looks like ltr is a two part mechanism, the capability and enable
> > lives in the pci express capability, but the tuning registers live in
> > extended capability space. The guest doesn't yet have access to the
> > latter since we don't have an express chipset. The capability and
> > enable are read-only to the guest currently, same for obff. Thanks,
> >
> > Alex
>


2012-05-09 01:18:50

by Hao, Xudong

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

> -----Original Message-----
> From: Alex Williamson [mailto:[email protected]]
> Sent: Tuesday, May 08, 2012 11:18 PM
> To: Hao, Xudong
> Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> [email protected]; Zhang, Xiantao
> Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> device assignment
>
> On Tue, 2012-05-08 at 09:16 +0000, Hao, Xudong wrote:
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:[email protected]]
> > > Sent: Tuesday, May 08, 2012 12:16 AM
> > > To: Hao, Xudong
> > > Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> > > [email protected]; Zhang, Xiantao
> > > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> guest
> > > device assignment
> > >
> > > On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > > > > -----Original Message-----
> > > > > From: Avi Kivity [mailto:[email protected]]
> > > > > Sent: Sunday, May 06, 2012 11:34 PM
> > > > > To: Xudong Hao
> > > > > Cc: [email protected]; [email protected];
> > > [email protected];
> > > > > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > > > > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> > > guest
> > > > > device assignment
> > > > >
> > > > > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > > > > Enable device LTR/OBFF capibility before do device assignment, so that
> > > guest
> > > > > can benefit from them.
> > > > >
> > > > > cc += Alex
> > > > >
> > > > > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > > > > if (pdev == NULL)
> > > > > > return -ENODEV;
> > > > > >
> > > > > > + /* Enable some device capibility before do device assignment,
> > > > > > + * so that guest can benefit from them.
> > > > > > + */
> > > > > > + kvm_iommu_enable_dev_caps(pdev);
> > > > > > r = iommu_attach_device(domain, &pdev->dev);
> > > > >
> > > > > Suppose we fail here. Do we need to disable_dev_caps()?
> > > > >
> > >
> > > If kvm_assign_device() fails we'll try to restore the state we saved in
> > > kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
> > > initial state.
> > >
> > Right, more clear.
> >
> > > > I don't think so. When a device will be assigned to guest, it's be
> > > > owned by a pci-stub driver, attach_device fail here do not affect
> > > > everything. If host want to use it, host device driver has its own
> > > > enable/disable dev_caps.
> > >
> > > Why is device assignment unique here? If there's a default value that's
> > > known to be safe, why doesn't pci_enable_device set it for everyone?
> > > Host drivers can fine tune the value later if they want.
> > >

If host did not have this device driver or host did not load the driver, who will enable them? Guest? But in guest, it really need qemu PCIe support.

> > > > > > if (r) {
> > > > > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > > > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > > > > > PCI_SLOT(assigned_dev->host_devfn),
> > > > > > PCI_FUNC(assigned_dev->host_devfn));
> > > > > >
> > > > > > + kvm_iommu_disable_dev_caps(pdev);
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm
> *kvm)
> > > > > > iommu_domain_free(domain);
> > > > > > return 0;
> > > > > > }
> > > > > > +
> > > > > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > > > > +{
> > > > > > + /* set default value */
> > > > > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > > > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> > > > >
> > > > > Where does this magic number come from?
> > > > >
> > > > The number is the max value that register support, set it as default
> > > > here, we did not have any device here, and we do not know what's the
> > > > proper value, so it set a default value firstly.
> > >
> > > The register is composed of latency scale and latency value fields.
> > > 1024 is simply the largest value the latency value can hold (+1). The
> > > scale field allows latencies up to 34,326,183,936ns to be specified, so
> > > please explain how 1024 is a universal default.
> > >
> >
> > Since each platform will have its own max supported latency, I think
> > the best way is setting the value to 0 because we have such a device
> > now.
>
> What's the benefit to that device vs the risk to other devices?

Default value 0 does not affect any device, right?

> Again,
> if there's a safe default value for both LTR and OBFF, why isn't PCI
> core setting it for everyone? I'm inclined to wait for qemu express
> support and expose LTR/OBFF control to the guest if and only if we can
> enable it on the root complex and intermediate switches. Thanks,
>

Alex, do you means you're working on the qemu express support and ltr/obff exposing? If so, when will this support finish?

Thanks
Xudong


????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2012-05-09 02:34:06

by Alex Williamson

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

On Wed, 2012-05-09 at 01:18 +0000, Hao, Xudong wrote:
> > -----Original Message-----
> > From: Alex Williamson [mailto:[email protected]]
> > Sent: Tuesday, May 08, 2012 11:18 PM
> > To: Hao, Xudong
> > Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> > [email protected]; Zhang, Xiantao
> > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> > device assignment
> >
> > On Tue, 2012-05-08 at 09:16 +0000, Hao, Xudong wrote:
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:[email protected]]
> > > > Sent: Tuesday, May 08, 2012 12:16 AM
> > > > To: Hao, Xudong
> > > > Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> > > > [email protected]; Zhang, Xiantao
> > > > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> > guest
> > > > device assignment
> > > >
> > > > On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > > > > > -----Original Message-----
> > > > > > From: Avi Kivity [mailto:[email protected]]
> > > > > > Sent: Sunday, May 06, 2012 11:34 PM
> > > > > > To: Xudong Hao
> > > > > > Cc: [email protected]; [email protected];
> > > > [email protected];
> > > > > > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > > > > > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> > > > guest
> > > > > > device assignment
> > > > > >
> > > > > > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > > > > > Enable device LTR/OBFF capibility before do device assignment, so that
> > > > guest
> > > > > > can benefit from them.
> > > > > >
> > > > > > cc += Alex
> > > > > >
> > > > > > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > > > > > if (pdev == NULL)
> > > > > > > return -ENODEV;
> > > > > > >
> > > > > > > + /* Enable some device capibility before do device assignment,
> > > > > > > + * so that guest can benefit from them.
> > > > > > > + */
> > > > > > > + kvm_iommu_enable_dev_caps(pdev);
> > > > > > > r = iommu_attach_device(domain, &pdev->dev);
> > > > > >
> > > > > > Suppose we fail here. Do we need to disable_dev_caps()?
> > > > > >
> > > >
> > > > If kvm_assign_device() fails we'll try to restore the state we saved in
> > > > kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
> > > > initial state.
> > > >
> > > Right, more clear.
> > >
> > > > > I don't think so. When a device will be assigned to guest, it's be
> > > > > owned by a pci-stub driver, attach_device fail here do not affect
> > > > > everything. If host want to use it, host device driver has its own
> > > > > enable/disable dev_caps.
> > > >
> > > > Why is device assignment unique here? If there's a default value that's
> > > > known to be safe, why doesn't pci_enable_device set it for everyone?
> > > > Host drivers can fine tune the value later if they want.
> > > >
>
> If host did not have this device driver or host did not load the
> driver, who will enable them? Guest? But in guest, it really need qemu
> PCIe support.

The kvm driver does pci_enable_device(), just like any other PCI driver.
If there's a safe default value, why isn't it set there?

> > > > > > > if (r) {
> > > > > > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > > > > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm *kvm,
> > > > > > > PCI_SLOT(assigned_dev->host_devfn),
> > > > > > > PCI_FUNC(assigned_dev->host_devfn));
> > > > > > >
> > > > > > > + kvm_iommu_disable_dev_caps(pdev);
> > > > > > > return 0;
> > > > > > > }
> > > > > > >
> > > > > > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct kvm
> > *kvm)
> > > > > > > iommu_domain_free(domain);
> > > > > > > return 0;
> > > > > > > }
> > > > > > > +
> > > > > > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > > > > > +{
> > > > > > > + /* set default value */
> > > > > > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > > > > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> > > > > >
> > > > > > Where does this magic number come from?
> > > > > >
> > > > > The number is the max value that register support, set it as default
> > > > > here, we did not have any device here, and we do not know what's the
> > > > > proper value, so it set a default value firstly.
> > > >
> > > > The register is composed of latency scale and latency value fields.
> > > > 1024 is simply the largest value the latency value can hold (+1). The
> > > > scale field allows latencies up to 34,326,183,936ns to be specified, so
> > > > please explain how 1024 is a universal default.
> > > >
> > >
> > > Since each platform will have its own max supported latency, I think
> > > the best way is setting the value to 0 because we have such a device
> > > now.
> >
> > What's the benefit to that device vs the risk to other devices?
>
> Default value 0 does not affect any device, right?

I don't know, but if it doesn't affect any device, why bother? On the
risk side we have to question whether the device ltr/obff support works,
whether the values we use are appropriate, whether the upstream
interconnects support the same, and work, and whether the guest driver
will behave appropriately with these enabled versus a gain of what? So
far it looks like we're turning it on simply because it's there.

> > Again,
> > if there's a safe default value for both LTR and OBFF, why isn't PCI
> > core setting it for everyone? I'm inclined to wait for qemu express
> > support and expose LTR/OBFF control to the guest if and only if we can
> > enable it on the root complex and intermediate switches. Thanks,
> >
>
> Alex, do you means you're working on the qemu express support and
> ltr/obff exposing? If so, when will this support finish?

Qemu express support is being worked on by the community, once
available, it may be simple to expose these register if we determine
it's safe for the guest to manipulate them. I think there's a goal of
incorporating express support by qemu 1.2, but I'm not driving it.
Thanks,

Alex

2012-05-13 07:04:36

by Hao, Xudong

[permalink] [raw]
Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest device assignment

> -----Original Message-----
> From: Alex Williamson [mailto:[email protected]]
> Sent: Wednesday, May 09, 2012 10:34 AM
> To: Hao, Xudong
> Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> [email protected]; Zhang, Xiantao
> Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing guest
> device assignment
>
> On Wed, 2012-05-09 at 01:18 +0000, Hao, Xudong wrote:
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:[email protected]]
> > > Sent: Tuesday, May 08, 2012 11:18 PM
> > > To: Hao, Xudong
> > > Cc: Avi Kivity; Xudong Hao; [email protected]; [email protected];
> > > [email protected]; Zhang, Xiantao
> > > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> guest
> > > device assignment
> > >
> > > On Tue, 2012-05-08 at 09:16 +0000, Hao, Xudong wrote:
> > > > > -----Original Message-----
> > > > > From: Alex Williamson [mailto:[email protected]]
> > > > > Sent: Tuesday, May 08, 2012 12:16 AM
> > > > > To: Hao, Xudong
> > > > > Cc: Avi Kivity; Xudong Hao; [email protected];
> [email protected];
> > > > > [email protected]; Zhang, Xiantao
> > > > > Subject: RE: [PATCH] kvm: Enable device LTR/OBFF capibility before doing
> > > guest
> > > > > device assignment
> > > > >
> > > > > On Mon, 2012-05-07 at 07:58 +0000, Hao, Xudong wrote:
> > > > > > > -----Original Message-----
> > > > > > > From: Avi Kivity [mailto:[email protected]]
> > > > > > > Sent: Sunday, May 06, 2012 11:34 PM
> > > > > > > To: Xudong Hao
> > > > > > > Cc: [email protected]; [email protected];
> > > > > [email protected];
> > > > > > > Zhang, Xiantao; Hao, Xudong; Alex Williamson
> > > > > > > Subject: Re: [PATCH] kvm: Enable device LTR/OBFF capibility before
> doing
> > > > > guest
> > > > > > > device assignment
> > > > > > >
> > > > > > > On 05/06/2012 06:24 PM, Xudong Hao wrote:
> > > > > > > > Enable device LTR/OBFF capibility before do device assignment, so
> that
> > > > > guest
> > > > > > > can benefit from them.
> > > > > > >
> > > > > > > cc += Alex
> > > > > > >
> > > > > > > > @@ -166,6 +166,10 @@ int kvm_assign_device(struct kvm *kvm,
> > > > > > > > if (pdev == NULL)
> > > > > > > > return -ENODEV;
> > > > > > > >
> > > > > > > > + /* Enable some device capibility before do device assignment,
> > > > > > > > + * so that guest can benefit from them.
> > > > > > > > + */
> > > > > > > > + kvm_iommu_enable_dev_caps(pdev);
> > > > > > > > r = iommu_attach_device(domain, &pdev->dev);
> > > > > > >
> > > > > > > Suppose we fail here. Do we need to disable_dev_caps()?
> > > > > > >
> > > > >
> > > > > If kvm_assign_device() fails we'll try to restore the state we saved in
> > > > > kvm_vm_ioctl_assign_device(), so ltr/obff should be brought back to
> > > > > initial state.
> > > > >
> > > > Right, more clear.
> > > >
> > > > > > I don't think so. When a device will be assigned to guest, it's be
> > > > > > owned by a pci-stub driver, attach_device fail here do not affect
> > > > > > everything. If host want to use it, host device driver has its own
> > > > > > enable/disable dev_caps.
> > > > >
> > > > > Why is device assignment unique here? If there's a default value that's
> > > > > known to be safe, why doesn't pci_enable_device set it for everyone?
> > > > > Host drivers can fine tune the value later if they want.
> > > > >
> >
> > If host did not have this device driver or host did not load the
> > driver, who will enable them? Guest? But in guest, it really need qemu
> > PCIe support.
>
> The kvm driver does pci_enable_device(), just like any other PCI driver.
> If there's a safe default value, why isn't it set there?
>

OK, I saw it. Seems it's reasonable to enable them in pci_enable_device(). I'll re-write patches there.

> > > > > > > > if (r) {
> > > > > > > > printk(KERN_ERR "assign device %x:%x:%x.%x failed",
> > > > > > > > @@ -228,6 +232,7 @@ int kvm_deassign_device(struct kvm
> *kvm,
> > > > > > > > PCI_SLOT(assigned_dev->host_devfn),
> > > > > > > > PCI_FUNC(assigned_dev->host_devfn));
> > > > > > > >
> > > > > > > > + kvm_iommu_disable_dev_caps(pdev);
> > > > > > > > return 0;
> > > > > > > > }
> > > > > > > >
> > > > > > > > @@ -351,3 +356,30 @@ int kvm_iommu_unmap_guest(struct
> kvm
> > > *kvm)
> > > > > > > > iommu_domain_free(domain);
> > > > > > > > return 0;
> > > > > > > > }
> > > > > > > > +
> > > > > > > > +static void kvm_iommu_enable_dev_caps(struct pci_dev *pdev)
> > > > > > > > +{
> > > > > > > > + /* set default value */
> > > > > > > > + unsigned long type = PCI_EXP_OBFF_SIGNAL_ALWAYS;
> > > > > > > > + int snoop_lat_ns = 1024, nosnoop_lat_ns = 1024;
> > > > > > >
> > > > > > > Where does this magic number come from?
> > > > > > >
> > > > > > The number is the max value that register support, set it as default
> > > > > > here, we did not have any device here, and we do not know what's the
> > > > > > proper value, so it set a default value firstly.
> > > > >
> > > > > The register is composed of latency scale and latency value fields.
> > > > > 1024 is simply the largest value the latency value can hold (+1). The
> > > > > scale field allows latencies up to 34,326,183,936ns to be specified, so
> > > > > please explain how 1024 is a universal default.
> > > > >
> > > >
> > > > Since each platform will have its own max supported latency, I think
> > > > the best way is setting the value to 0 because we have such a device
> > > > now.
> > >
> > > What's the benefit to that device vs the risk to other devices?
> >
> > Default value 0 does not affect any device, right?
>
> I don't know, but if it doesn't affect any device, why bother? On the
> risk side we have to question whether the device ltr/obff support works,
> whether the values we use are appropriate, whether the upstream
> interconnects support the same, and work, and whether the guest driver
> will behave appropriately with these enabled versus a gain of what? So
> far it looks like we're turning it on simply because it's there.
>
> > > Again,
> > > if there's a safe default value for both LTR and OBFF, why isn't PCI
> > > core setting it for everyone? I'm inclined to wait for qemu express
> > > support and expose LTR/OBFF control to the guest if and only if we can
> > > enable it on the root complex and intermediate switches. Thanks,
> > >
> >
> > Alex, do you means you're working on the qemu express support and
> > ltr/obff exposing? If so, when will this support finish?
>
> Qemu express support is being worked on by the community, once
> available, it may be simple to expose these register if we determine
> it's safe for the guest to manipulate them. I think there's a goal of
> incorporating express support by qemu 1.2, but I'm not driving it.
> Thanks,
>
> Alex

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?