2012-02-08 15:28:02

by Joerg Roedel

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Wed, Feb 01, 2012 at 03:46:52PM +1100, David Gibson wrote:
> In order to safely drive a device with a userspace driver, or to pass
> it through to a guest system, we must first make sure that the device
> is isolated in such a way that it cannot interfere with other devices
> on the system. This isolation is only available on some systems and
> will generally require an iommu, and might require other support in
> bridges or other system hardware.
>
> Often, it's not possible to isolate every device from every other
> device in the system. For example, certain PCI/PCIe bridge
> configurations mean that an iommu cannot reliably distinguish which
> device behind the bridge initiated a DMA transaction. Similarly some
> buggy PCI multifunction devices initiate all DMAs as function 0, so
> the functions cannot be isolated from each other, even if the IOMMU
> normally allows this.
>
> Therefore, the user, and code to allow userspace drivers or guest
> passthrough, needs a way to determine which devices can be isolated
> from which others. This patch adds infrastructure to handle this by
> introducing the concept of a "device isolation group" - a group of
> devices which can, as a unit, be safely isolated from the rest of the
> system and therefore can be, as a unit, safely assigned to an
> unprivileged user or guest. That is, the groups represent the minimum
> granularity with which devices may be assigned to untrusted
> components.
>
> This code manages groups, but does not create them or allow use of
> grouped devices by a guest. Creating groups would be done by iommu or
> bridge drivers, using the interface this patch provides. It's
> expected that the groups will be used in future by the in-kernel iommu
> interface, and would also be used by VFIO or other subsystems to allow
> safe passthrough of devices to userspace or guests.
>
> Signed-off-by: Alexey Kardashevskiy <[email protected]>
> Signed-off-by: David Gibson <[email protected]>
> ---
> drivers/base/Kconfig | 3 +
> drivers/base/Makefile | 1 +
> drivers/base/base.h | 3 +
> drivers/base/core.c | 6 ++
> drivers/base/device_isolation.c | 184 ++++++++++++++++++++++++++++++++++++++
> drivers/base/init.c | 2 +
> include/linux/device.h | 5 +
> include/linux/device_isolation.h | 100 +++++++++++++++++++++

Again, device grouping is done by the IOMMU drivers, so this all belongs
in the generic iommu-code rather than the driver core.

I think it makes sense to introduce a device->iommu pointer which
depends on CONFIG_IOMMU_API and put the group information into it.
This also has the benefit that we can consolidate all the
device->arch.iommu pointers into device->iommu as well.
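
A minimal sketch of what such a pointer could look like, written as a standalone user-space model rather than kernel code. All names here (struct iso_group, struct dev_iommu_info, dev_group(), dev_attach_group()) are hypothetical and are not taken from the patch or from any existing kernel header; CONFIG_IOMMU_API is simply defined locally for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Defined locally for this sketch; in a kernel build it would come from Kconfig. */
#define CONFIG_IOMMU_API 1

/* Hypothetical isolation group: the unit of assignment. */
struct iso_group {
    int id;
    int ndevices;
};

/* Hypothetical per-device IOMMU data hanging off device->iommu. */
struct dev_iommu_info {
    struct iso_group *group;  /* group this device belongs to */
    void *arch_data;          /* stand-in for what device->arch.iommu holds today */
};

/* Simplified model of struct device, only the fields the sketch needs. */
struct device {
    const char *name;
#ifdef CONFIG_IOMMU_API
    struct dev_iommu_info *iommu;
#endif
};

#ifdef CONFIG_IOMMU_API
/* Look up the group of a device; NULL if no IOMMU data is attached. */
static inline struct iso_group *dev_group(const struct device *dev)
{
    return dev->iommu ? dev->iommu->group : NULL;
}

/* Attach IOMMU data to a device and record its group membership. */
static inline void dev_attach_group(struct device *dev,
                                    struct dev_iommu_info *info,
                                    struct iso_group *grp)
{
    assert(dev && info && grp);
    info->group = grp;
    dev->iommu = info;
    grp->ndevices++;
}
#else
/* Without the IOMMU API there are no groups at all. */
static inline struct iso_group *dev_group(const struct device *dev)
{
    (void)dev;
    return NULL;
}
#endif
```

In this model the group lookup compiles away to NULL when the IOMMU API is disabled, which is one way the "no separate config option" idea could be expressed.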


> 8 files changed, 304 insertions(+), 0 deletions(-)
> create mode 100644 drivers/base/device_isolation.c
> create mode 100644 include/linux/device_isolation.h
>
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 7be9f79..a52f2db 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -189,4 +189,7 @@ config DMA_SHARED_BUFFER
> APIs extension; the file's descriptor can then be passed on to other
> driver.
>
> +config DEVICE_ISOLATION
> + bool "Enable isolating devices for safe pass-through to guests or user space."
> +

No need for a config option. When IOMMU drivers are enabled we also want
the group code to be active.


Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


2012-02-08 21:40:06

by Benjamin Herrenschmidt

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Wed, 2012-02-08 at 16:27 +0100, Joerg Roedel wrote:
> Again, device grouping is done by the IOMMU drivers, so this all
> belongs in the generic iommu-code rather than the driver core.

Except that there isn't really a "generic iommu code"... discovery,
initialization & matching of iommu vs. devices etc... that's all
implemented in the arch specific iommu code.

Cheers,
Ben.

> I think it makes sense to introduce a device->iommu pointer which
> depends on CONFIG_IOMMU_API and put the group information into it.
> This also has the benefit that we can consolidate all the
> device->arch.iommu pointers into device->iommu as well.
>
>

2012-02-08 21:43:47

by Benjamin Herrenschmidt

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Wed, 2012-02-08 at 16:27 +0100, Joerg Roedel wrote:
> Again, device grouping is done by the IOMMU drivers, so this all
> belongs in the generic iommu-code rather than the driver core.
>
> I think it makes sense to introduce a device->iommu pointer which
> depends on CONFIG_IOMMU_API and put the group information into it.
> This also has the benefit that we can consolidate all the
> device->arch.iommu pointers into device->iommu as well.

... and I pressed send too quickly before.

So not only that, but these patches are simply a mechanism to expose
those groups to userspace and allow ownership (i.e. synchronize with the
attachment/detachment of kernel drivers).

So this code totally belongs in the driver core.

It does -not- address the issue of deciding how the groups are formed;
for this, it expects the information to be provided by the arch iommu
layer and we'll have to work on that.

The way iommu grouping works is too dependent on a given HW setup; you
can't really do that generically.

Yes, some factors are going to be common, such as the already mentioned
ricoh chip, but I think the best we can do here is provide quirks for
the iommu code to use.

There are capacity limits on how bdfn filtering works on bridges, either
CAM size or simply how it is arranged (i.e. on POWER, for example, I can
choose to identify a function, a device, or a range of bus numbers, but in
the latter case it has to be an aligned power of two up to 32), etc...
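
To make the bus-range constraint concrete, here is a small sketch of a validity check. It assumes one reading of the sentence above: the number of buses must be a power of two no larger than 32, and the first bus number must be aligned to that size. The function name bus_range_ok() and the 256-bus limit per PCI segment are assumptions added for illustration, not something taken from the hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed number of bus numbers in one PCI segment (8-bit bus field). */
#define SKETCH_MAX_BUSES 256u

/*
 * Hypothetical check: can [first_bus, first_bus + count) be matched by a
 * single filter entry that only supports aligned power-of-two ranges of
 * at most 32 buses?
 */
static bool bus_range_ok(unsigned int first_bus, unsigned int count)
{
    if (count == 0 || count > 32)
        return false;                 /* size limit */
    if (count & (count - 1))
        return false;                 /* not a power of two */
    if (first_bus & (count - 1))
        return false;                 /* start not aligned to the size */
    return first_bus + count <= SKETCH_MAX_BUSES;
}
```

Under these assumptions a range of 8 buses starting at bus 8 is representable, while the same 8 buses starting at bus 4 are not, so such a bridge would need a coarser (larger) group.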

I wouldn't try to solve that generically just yet.

Cheers,
Ben.

2012-02-09 03:34:47

by David Gibson

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Wed, Feb 08, 2012 at 04:27:48PM +0100, Joerg Roedel wrote:
> On Wed, Feb 01, 2012 at 03:46:52PM +1100, David Gibson wrote:
> > In order to safely drive a device with a userspace driver, or to pass
> > it through to a guest system, we must first make sure that the device
> > is isolated in such a way that it cannot interfere with other devices
> > on the system. This isolation is only available on some systems and
> > will generally require an iommu, and might require other support in
> > bridges or other system hardware.
> >
> > Often, it's not possible to isolate every device from every other
> > device in the system. For example, certain PCI/PCIe bridge
> > configurations mean that an iommu cannot reliably distinguish which
> > device behind the bridge initiated a DMA transaction. Similarly some
> > buggy PCI multifunction devices initiate all DMAs as function 0, so
> > the functions cannot be isolated from each other, even if the IOMMU
> > normally allows this.
> >
> > Therefore, the user, and code to allow userspace drivers or guest
> > passthrough, needs a way to determine which devices can be isolated
> > from which others. This patch adds infrastructure to handle this by
> > introducing the concept of a "device isolation group" - a group of
> > devices which can, as a unit, be safely isolated from the rest of the
> > system and therefore can be, as a unit, safely assigned to an
> > unprivileged user or guest. That is, the groups represent the minimum
> > granularity with which devices may be assigned to untrusted
> > components.
> >
> > This code manages groups, but does not create them or allow use of
> > grouped devices by a guest. Creating groups would be done by iommu or
> > bridge drivers, using the interface this patch provides. It's
> > expected that the groups will be used in future by the in-kernel iommu
> > interface, and would also be used by VFIO or other subsystems to allow
> > safe passthrough of devices to userspace or guests.
> >
> > Signed-off-by: Alexey Kardashevskiy <[email protected]>
> > Signed-off-by: David Gibson <[email protected]>
> > ---
> > drivers/base/Kconfig | 3 +
> > drivers/base/Makefile | 1 +
> > drivers/base/base.h | 3 +
> > drivers/base/core.c | 6 ++
> > drivers/base/device_isolation.c | 184 ++++++++++++++++++++++++++++++++++++++
> > drivers/base/init.c | 2 +
> > include/linux/device.h | 5 +
> > include/linux/device_isolation.h | 100 +++++++++++++++++++++
>
> Again, device grouping is done by the IOMMU drivers, so this all belongs
> in the generic iommu-code rather than the driver core.
>
> I think it makes sense to introduce a device->iommu pointer which
> depends on CONFIG_IOMMU_API and put the group information into it.
> This also has the benefit that we can consolidate all the
> device->arch.iommu pointers into device->iommu as well.

Well, not quite. In the two example setups in the subsequent patches
the grouping is done by the bridge driver, which in these cases is not
IOMMU_API aware. They probably should become so, but that's another
project - and relies on the IOMMU_API becoming group aware.

Note that although iommus are the main source of group constraints,
they're not necessarily the only one. Bridge error isolation semantics
may also play a part, for one.

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson

2012-02-09 11:28:15

by Joerg Roedel

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Thu, Feb 09, 2012 at 08:39:28AM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2012-02-08 at 16:27 +0100, Joerg Roedel wrote:
> > Again, device grouping is done by the IOMMU drivers, so this all
> > belongs in the generic iommu-code rather than the driver core.
>
> Except that there isn't really a "generic iommu code"... discovery,
> initialization & matching of iommu vs. devices etc... that's all
> implemented in the arch specific iommu code.

The whole point of moving the iommu drivers to drivers/iommu was to
factor out common code. We are not where we want to be yet but the goal
is to move more code to the generic part.

For the group-code this means that the generic code should iterate over
all devices on a bus and build up group structures based on isolation
information provided by the arch specific code.
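
One possible shape for this split, sketched as standalone C rather than kernel code: the generic side walks the devices and assigns group numbers, while the arch side supplies a callback that says which devices cannot be told apart. Everything here is hypothetical (struct sketch_dev, isolation_key_fn, build_groups(), key_per_device()), and the callback-returns-a-key interface is only one assumption about how the isolation information might be expressed.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified device: only a bus/device/function number and a result slot. */
struct sketch_dev {
    unsigned int bdfn;  /* bus << 8 | device << 3 | function */
    int group;          /* filled in by build_groups() */
};

/*
 * Arch-provided callback (hypothetical): devices that return the same key
 * cannot be isolated from each other and must share a group.
 */
typedef unsigned int (*isolation_key_fn)(const struct sketch_dev *dev);

/*
 * Generic part: iterate over all devices and give each distinct key its
 * own group number. Returns the number of groups created.
 */
static int build_groups(struct sketch_dev *devs, size_t n, isolation_key_fn key)
{
    int ngroups = 0;

    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        devs[i].group = -1;
        /* Reuse the group of an earlier device with the same key. */
        for (size_t j = 0; j < i; j++) {
            if (key(&devs[j]) == key(&devs[i])) {
                devs[i].group = devs[j].group;
                break;
            }
        }
        if (devs[i].group < 0)
            devs[i].group = ngroups++;
    }
    return ngroups;
}

/*
 * Example arch callback: treat all functions of one PCI device as
 * indistinguishable by masking off the 3 function bits.
 */
static unsigned int key_per_device(const struct sketch_dev *dev)
{
    return dev->bdfn & ~0x7u;
}
```

Note that the arch-specific knowledge still lives in code (the callback); the generic part only contributes the iteration and the bookkeeping.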


Joerg


2012-02-10 00:21:22

by David Gibson

Subject: Re: [PATCH 1/3] Device isolation group infrastructure (v3)

On Thu, Feb 09, 2012 at 12:28:05PM +0100, Joerg Roedel wrote:
> On Thu, Feb 09, 2012 at 08:39:28AM +1100, Benjamin Herrenschmidt wrote:
> > On Wed, 2012-02-08 at 16:27 +0100, Joerg Roedel wrote:
> > > Again, device grouping is done by the IOMMU drivers, so this all
> > > belongs in the generic iommu-code rather than the driver core.
> >
> > Except that there isn't really a "generic iommu code"... discovery,
> > initialization & matching of iommu vs. devices etc... that's all
> > implemented in the arch specific iommu code.
>
> The whole point of moving the iommu drivers to drivers/iommu was to
> factor out common code. We are not where we want to be yet but the goal
> is to move more code to the generic part.
>
> For the group-code this means that the generic code should iterate over
> all devices on a bus and build up group structures based on isolation
> information provided by the arch specific code.

And how exactly do you suggest it provide that information? I really
can't see how an iommu driver would specify its isolation constraints
generally enough, except in the form of code, and then we're back to
where we are now.
