LinuxLists.cc - [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

2014-06-13 16:31:19

Subject: [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

The user of the IOMMU API domain expects to have full control of
the IOVA space for the domain. RMRRs are fundamentally incompatible
with that idea. We can neither map the RMRR into the IOMMU API
domain, nor can we guarantee that the device won't continue DMA with
the area described by the RMRR as part of the new domain. Therefore
we must prevent such devices from being used by the IOMMU API.

Signed-off-by: Alex Williamson <[email protected]>
Cc: [email protected]
---

v2: consolidate test to a single, well documented function.

drivers/iommu/intel-iommu.c | 49 ++++++++++++++++++++++++++++++++++---------
1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c4f11c0..253d598 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2511,22 +2511,46 @@ static bool device_has_rmrr(struct device *dev)
 	return false;
 }

+/*
+ * There are a couple cases where we need to restrict the functionality of
+ * devices associated with RMRRs. The first is when evaluating a device for
+ * identity mapping because problems exist when devices are moved in and out
+ * of domains and their respective RMRR information is lost. This means that
+ * a device with associated RMRRs will never be in a "passthrough" domain.
+ * The second is use of the device through the IOMMU API. This interface
+ * expects to have full control of the IOVA space for the device. We cannot
+ * satisfy both the requirement that RMRR access is maintained and have an
+ * unencumbered IOVA space. We also have no ability to quiesce the device's
+ * use of the RMRR space or even inform the IOMMU API user of the restriction.
+ * We therefore prevent devices associated with an RMRR from participating in
+ * the IOMMU API, which eliminates them from device assignment.
+ *
+ * In both cases we assume that PCI USB devices with RMRRs have them largely
+ * for historical reasons and that the RMRR space is not actively used post
+ * boot. This exclusion may change if vendors begin to abuse it.
+ */
+static bool device_is_rmrr_locked(struct device *dev)
+{
+	if (!device_has_rmrr(dev))
+		return false;
+
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+
+		if ((pdev->class >> 8) == PCI_CLASS_SERIAL_USB)
+			return false;
+	}
+
+	return true;
+}
+
 static int iommu_should_identity_map(struct device *dev, int startup)
 {

 	if (dev_is_pci(dev)) {
 		struct pci_dev *pdev = to_pci_dev(dev);

-		/*
-		 * We want to prevent any device associated with an RMRR from
-		 * getting placed into the SI Domain. This is done because
-		 * problems exist when devices are moved in and out of domains
-		 * and their respective RMRR info is lost. We exempt USB devices
-		 * from this process due to their usage of RMRRs that are known
-		 * to not be needed after BIOS hand-off to OS.
-		 */
-		if (device_has_rmrr(dev) &&
-		    (pdev->class >> 8) != PCI_CLASS_SERIAL_USB)
+		if (device_is_rmrr_locked(dev))
 			return 0;

 		if ((iommu_identity_mapping & IDENTMAP_AZALIA) && IS_AZALIA(pdev))
@@ -4171,6 +4195,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
 	int addr_width;
 	u8 bus, devfn;

+	if (device_is_rmrr_locked(dev)) {
+		dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.\n");
+		return -EPERM;
+	}
+
 	/* normally dev is not mapped */
 	if (unlikely(domain_context_mapped(dev))) {
 		struct dmar_domain *old_domain;
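
For readers who want to try the decision logic outside the kernel, here is a minimal userspace sketch of the check the patch adds. It is not the kernel code: struct model_dev and model_is_rmrr_locked() are hypothetical stand-ins for struct device, device_has_rmrr(), dev_is_pci() and pdev->class. The only value taken from the kernel is PCI_CLASS_SERIAL_USB (0x0c03). The 24-bit class code holds base class, subclass and programming interface, so shifting right by 8 drops the programming interface before the comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as PCI_CLASS_SERIAL_USB in include/linux/pci_ids.h. */
#define PCI_CLASS_SERIAL_USB 0x0c03

/* Hypothetical model of the facts the patch consults about a device. */
struct model_dev {
	bool has_rmrr;      /* stands in for device_has_rmrr(dev) */
	bool is_pci;        /* stands in for dev_is_pci(dev) */
	uint32_t pci_class; /* 24-bit code: base << 16 | subclass << 8 | prog-if */
};

/* Follows the same order of tests as device_is_rmrr_locked() in the patch. */
static bool model_is_rmrr_locked(const struct model_dev *d)
{
	assert(d != NULL);

	/* No RMRR: nothing restricts the device. */
	if (!d->has_rmrr)
		return false;

	/* PCI USB controllers are exempted, whatever their prog-if byte. */
	if (d->is_pci && (d->pci_class >> 8) == PCI_CLASS_SERIAL_USB)
		return false;

	/* Any other device with an RMRR is locked. */
	return true;
}
```

Under this model a VGA-class device (class code 0x030000) with an RMRR is locked, while an xHCI controller (0x0c0330) with an RMRR is not.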

2014-06-17 05:35:31

by Alex Williamson

Subject: Re: [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

On Fri, 2014-06-13 at 10:30 -0600, Alex Williamson wrote:
> The user of the IOMMU API domain expects to have full control of
> the IOVA space for the domain. RMRRs are fundamentally incompatible
> with that idea. We can neither map the RMRR into the IOMMU API
> domain, nor can we guarantee that the device won't continue DMA with
> the area described by the RMRR as part of the new domain. Therefore
> we must prevent such devices from being used by the IOMMU API.
>
> Signed-off-by: Alex Williamson <[email protected]>
> Cc: [email protected]
> ---

David,

Any idea what an off-the-shelf Asus motherboard would be doing with an
RMRR on the Intel HD graphics?

dmar: RMRR base: 0x000000bb800000 end: 0x000000bf9fffff
IOMMU: Setting identity map for device 0000:00:02.0 [0xbb800000 - 0xbf9fffff]

Thanks,

Alex
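
To put a number on the region in the log above: the RMRR bounds are inclusive, so the size is end - base + 1, which for this board comes to 0x4200000 bytes (66 MiB). The helper below is a small userspace sketch for doing that arithmetic on a "dmar: RMRR base: ... end: ..." line. The function names are hypothetical, and the format string assumes the exact message layout shown in this mail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Size in bytes of an inclusive [base, end] range, as RMRRs are reported. */
static uint64_t rmrr_size(uint64_t base, uint64_t end)
{
	assert(end >= base);
	return end - base + 1;
}

/*
 * Parse a line of the form "dmar: RMRR base: 0x... end: 0x...".
 * Returns 0 on success and -1 if the line does not match.
 */
static int parse_rmrr_line(const char *line, uint64_t *base, uint64_t *end)
{
	unsigned long long b, e;

	/* %llx accepts an optional 0x prefix, like strtoull() with base 16. */
	if (sscanf(line, "dmar: RMRR base: %llx end: %llx", &b, &e) != 2)
		return -1;

	*base = b;
	*end = e;
	return 0;
}
```

Feeding it the first log line from this mail yields base 0xbb800000 and end 0xbf9fffff, and rmrr_size() of those is 69206016 bytes.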

2014-06-17 07:05:00

by David Woodhouse

On Tue, 2014-06-17 at 09:15 +0200, Daniel Vetter wrote:
> We've always been struggling with stolen handling, and we've always
> been struggling with vt-d stuff. Also pass-through seems to be a major
> pain (I've never tried it myself). Given all that I'm voting for keeping
> the RMRR and everything else as much like the normal case as possible,
> since I have no idea what exactly must be remapped and what's optional.
> The gpu is definitely keeping a lot of its own private stuff in various
> chunks of stolen memory.

Keeping it like the normal case is distinctly non-trivial. I raised that
possibility, and it's hard. You have to make the guests' address maps
match the host, in that the E820-reserved regions used for DMA and
listed in RMRRs must also appear as reserved for the guests.

That was bad enough when it was just 'BIOS might be doing something evil
behind our back' and we didn't need to let the guest *access* those
pages. But in the i915 case we do actually map and access the stolen
region too, so the task is even harder. We'd need to be able to decide
when those regions should actually be mapped into the guest.

--
dwmw2
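
The requirement described above (every RMRR must also show up as reserved in the guest's address map) can be expressed as a simple range check. The following is a userspace sketch only. struct region, enum region_type and range_is_reserved() are hypothetical and far simpler than a real E820 table, and the check assumes a single reserved entry covers the whole RMRR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of E820-style region types. */
enum region_type { REGION_RAM, REGION_RESERVED };

/* One entry of a simplified address map; start and end are inclusive. */
struct region {
	uint64_t start;
	uint64_t end;
	enum region_type type;
};

/*
 * Return true if [start, end] lies entirely inside one reserved entry.
 * Assumption: a reserved range is never split across adjacent entries.
 */
static bool range_is_reserved(const struct region *map, size_t n,
			      uint64_t start, uint64_t end)
{
	size_t i;

	assert(map != NULL || n == 0);
	assert(end >= start);

	for (i = 0; i < n; i++) {
		if (map[i].type == REGION_RESERVED &&
		    map[i].start <= start && end <= map[i].end)
			return true;
	}
	return false;
}
```

A guest map that lists 0xbb800000-0xbf9fffff as ordinary RAM would fail this check for the IGD RMRR quoted earlier in the thread, which is the mismatch being described.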

2014-06-17 08:14:25

by Daniel Vetter

Subject: Re: [Intel-gfx] [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

On Tue, Jun 17, 2014 at 08:21:31AM +0100, David Woodhouse wrote:
> On Tue, 2014-06-17 at 09:15 +0200, Daniel Vetter wrote:
> > We've always been struggling with stolen handling, and we've always
> > been struggling with vt-d stuff. Also pass-through seems to be a major
> > pain (I've never tried it myself). Given all that I'm voting for keeping
> > the RMRR and everything else as much like the normal case as possible,
> > since I have no idea what exactly must be remapped and what's optional.
> > The gpu is definitely keeping a lot of its own private stuff in various
> > chunks of stolen memory.
>
> Keeping it like the normal case is distinctly non-trivial. I raised that
> possibility, and it's hard. You have to make the guests' address maps
> match the host, in that the E820-reserved regions used for DMA and
> listed in RMRRs must also appear as reserved for the guests.
>
> That was bad enough when it was just 'BIOS might be doing something evil
> behind our back' and we didn't need to let the guest *access* those
> pages. But in the i915 case we do actually map and access the stolen
> region too, so the task is even harder. We'd need to be able to decide
> when those regions should actually be mapped into the guest.

Hm, we check some registers (which iirc are set up by the bios) for stolen
to detect the address and size. And if that's there we use it. So not sure
what to do really.

For my understanding: The tricky part with RMRR isn't the mapping, but
making sure that the guest memory layout has the corresponding range
properly marked as reserved in the e820 map (i.e. like on real machines)?
I guess we wouldn't need to care about the actual memory since the host
linux can't access it either (without i915.ko).
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

2014-06-17 12:23:06

by Alex Williamson

Subject: Re: [PATCH v2] iommu/intel: Exclude devices using RMRRs from IOMMU API domains

On Tue, 2014-06-17 at 08:04 +0100, David Woodhouse wrote:
> On Mon, 2014-06-16 at 23:35 -0600, Alex Williamson wrote:
> >
> > Any idea what an off-the-shelf Asus motherboard would be doing with an
> > RMRR on the Intel HD graphics?
> >
> > dmar: RMRR base: 0x000000bb800000 end: 0x000000bf9fffff
> > IOMMU: Setting identity map for device 0000:00:02.0 [0xbb800000 - 0xbf9fffff]
>
> Hm, we should have thought of that sooner. That's quite normal — it's
> for the 'stolen' memory used for the framebuffer. And maybe also the
> GTT, and shadow GTT and other things; I forget precisely what, and it
> varies from one setup to another.

Why exactly do these things need to be identity mapped through the
IOMMU? This sounds like something a normal device might do with a
coherent mapping.

> I'd expect fairly much all systems to have an RMRR for the integrated
> graphics device if they have one, and your patch is going to prevent
> assignment of those to guests... as you've presumably noticed.
>
> I'm not sure if the i915 driver is capable of fully reprogramming the
> hardware to completely stop using that region, to allow assignment to a
> guest with a 'pure' memory map and no stolen region. I suppose it must,
> if assignment to guests was working correctly before?

IGD assignment has never worked with KVM.

> Perhaps the better answer here is not to have the special cases in
> 'device_is_rmrr_locked()', and instead allow a device driver to call a
> 'iommu_release_rmrrs()' function once it's reset the hardware to *stop*
> doing whatever DMA the BIOS set it up with.

IGD supports FLR, which is good, but I would assume an FLR doesn't
necessarily release use of this region and being a root complex device I
don't think we have a bigger hammer reset option. Thanks,

Alex
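
The alternative David suggests in the quoted text can be modeled as a per-device flag that a driver sets once it has stopped the firmware-initiated DMA. This is only a userspace sketch of the idea: iommu_release_rmrrs() was a proposal in this thread, not an existing function, and every name below is hypothetical. Whether a reset such as FLR is actually enough to justify setting the flag is the open question raised in this reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not a kernel structure. */
struct released_dev {
	bool has_rmrr;      /* firmware reported an RMRR for this device */
	bool rmrr_released; /* driver declared the RMRR no longer in use */
};

/* Hypothetical counterpart of the suggested iommu_release_rmrrs(). */
static void model_release_rmrrs(struct released_dev *d)
{
	assert(d != NULL);
	d->rmrr_released = true;
}

/* Locked only while an RMRR exists and no driver has released it. */
static bool model_locked_until_released(const struct released_dev *d)
{
	assert(d != NULL);
	return d->has_rmrr && !d->rmrr_released;
}
```

With this shape there is no class-based special case: a USB controller and an IGD are treated the same, and each becomes assignable only after its driver makes the release call.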

2014-06-17 12:41:18

by Daniel Vetter

On Thu, Jun 19, 2014 at 4:29 PM, Alex Williamson
<[email protected]> wrote:
> But is there a way for software to discover its location from the
> device? If so, then I think we can recreate all the identity maps we'd
> need for a guest from the device. If not, then we'd need to figure out
> some IOMMU API extension to handle the mapping. The spec excerpt above
> seems to indicate that hardware designers decided software doesn't need
> to know about it, but the RMRR seems to be the "oh crap" moment when
> they realized that yes we do need to know about it. Thanks,

It's all specified somewhere how it exactly works. But we've just had
piles of fun trying to get the stolen range (i.e. for gfx buffer
usage, not the gtt pte block) to work correctly and it's not been fun.
The issue is that these registers are sw-defined and set by the bios.
And the bios team occasionally smokes strong stuff and willy-nilly
changes the definitions without telling anyone ... And we know that
there's more reserved stuff in that stolen range that occasionally
shouldn't be used by the driver. We have regular discussions with
them.

Otoh the same bios teams also set up the RMRR ranges with equally
predictable results.

I don't have a recommendation here, but expect breakage no matter what you do.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch