Between me trying to get rid of iommu_present() and Mario wanting to
support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
that the iommu_dma_protection attribute is being far too optimistic.
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device(s)
we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
is tell us that memory was protected before the kernel was loaded, and
prevent the user from disabling the intel-iommu driver entirely. What
actually matters is whether we trust individual devices, based on the
"external facing" property that we expect firmware to describe for
Thunderbolt ports.
Avoid false positives by looking as close as possible to the same PCI
topology that the IOMMU layer will consider once a Thunderbolt endpoint
appears. Crucially, we can't assume that IOMMU translation being enabled
for any reason is sufficient on its own; full (expensive) DMA protection
will still only be imposed on untrusted devices.
CC: Mario Limonciello <[email protected]>
Signed-off-by: Robin Murphy <[email protected]>
---
This supersedes my previous attempt just trying to replace
iommu_present() at [1], further to the original discussion at [2].
[1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
[2] https://lore.kernel.org/linux-iommu/[email protected]/T/
drivers/thunderbolt/domain.c | 12 +++---------
drivers/thunderbolt/nhi.c | 35 +++++++++++++++++++++++++++++++++++
include/linux/thunderbolt.h | 2 ++
3 files changed, 40 insertions(+), 9 deletions(-)
diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
index 7018d959f775..d5c825e84ac8 100644
--- a/drivers/thunderbolt/domain.c
+++ b/drivers/thunderbolt/domain.c
@@ -7,9 +7,7 @@
*/
#include <linux/device.h>
-#include <linux/dmar.h>
#include <linux/idr.h>
-#include <linux/iommu.h>
#include <linux/module.h>
#include <linux/pm_runtime.h>
#include <linux/slab.h>
@@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
- /*
- * Kernel DMA protection is a feature where Thunderbolt security is
- * handled natively using IOMMU. It is enabled when IOMMU is
- * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
- */
- return sprintf(buf, "%d\n",
- iommu_present(&pci_bus_type) && dmar_platform_optin());
+ struct tb *tb = container_of(dev, struct tb, dev);
+
+ return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
}
static DEVICE_ATTR_RO(iommu_dma_protection);
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index c73da0532be4..e12c2e266741 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -14,6 +14,7 @@
#include <linux/errno.h>
#include <linux/pci.h>
#include <linux/interrupt.h>
+#include <linux/iommu.h>
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/property.h>
@@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
}
+static void nhi_check_iommu(struct tb_nhi *nhi)
+{
+ struct pci_dev *pdev;
+ bool port_ok = false;
+
+ /*
+ * Check for sibling devices that look like they should be our
+ * tunnelled ports. We can reasonably assume that if an IOMMU is
+ * managing the bridge it will manage any future devices beyond it
+ * too. If firmware has described a port as external-facing as
+ * expected then we can trust the IOMMU layer to enforce isolation;
+ * otherwise even if translation is enabled for existing devices it
+ * may potentially be overridden for a future tunnelled endpoint.
+ */
+ for_each_pci_bridge(pdev, nhi->pdev->bus) {
+ if (!pci_is_pcie(pdev) ||
+ !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
+ pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
+ continue;
+
+ if (!device_iommu_mapped(&pdev->dev))
+ return;
+
+ if (!pdev->untrusted) {
+ dev_info(&nhi->pdev->dev,
+ "Assuming unreliable Kernel DMA protection\n");
+ return;
+ }
+ port_ok = true;
+ }
+ nhi->iommu_dma_protection = port_ok;
+}
+
static int nhi_init_msi(struct tb_nhi *nhi)
{
struct pci_dev *pdev = nhi->pdev;
@@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return -ENOMEM;
nhi_check_quirks(nhi);
+ nhi_check_iommu(nhi);
res = nhi_init_msi(nhi);
if (res) {
diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
index 124e13cb1469..7a8ad984e651 100644
--- a/include/linux/thunderbolt.h
+++ b/include/linux/thunderbolt.h
@@ -465,6 +465,7 @@ static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
* @msix_ida: Used to allocate MSI-X vectors for rings
* @going_away: The host controller device is about to disappear so when
* this flag is set, avoid touching the hardware anymore.
+ * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
* @interrupt_work: Work scheduled to handle ring interrupt when no
* MSI-X is used.
* @hop_count: Number of rings (end point hops) supported by NHI.
@@ -479,6 +480,7 @@ struct tb_nhi {
struct tb_ring **rx_rings;
struct ida msix_ida;
bool going_away;
+ bool iommu_dma_protection;
struct work_struct interrupt_work;
u32 hop_count;
unsigned long quirks;
--
2.28.0.dirty
Hi Robin,
Thanks for working on this!
On Thu, Mar 17, 2022 at 04:17:07PM +0000, Robin Murphy wrote:
> Between me trying to get rid of iommu_present() and Mario wanting to
> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
> that the iommu_dma_protection attribute is being far too optimistic.
> Even if an IOMMU might be present for some PCI segment in the system,
> that doesn't necessarily mean it provides translation for the device(s)
> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
> is tell us that memory was protected before the kernel was loaded, and
> prevent the user from disabling the intel-iommu driver entirely. What
> actually matters is whether we trust individual devices, based on the
> "external facing" property that we expect firmware to describe for
> Thunderbolt ports.
We still want to know that DMAR_PLATFORM_OPT_IN is set by the firmware
because that tells us that none of the devices connected before OS got
control had the ability to perform DMA outside of the RMRR regions. If
they did then all this is pointless because they could have modified the
system memory as they wished.
> Avoid false positives by looking as close as possible to the same PCI
> topology that the IOMMU layer will consider once a Thunderbolt endpoint
> appears. Crucially, we can't assume that IOMMU translation being enabled
> for any reason is sufficient on its own; full (expensive) DMA protection
> will still only be imposed on untrusted devices.
>
> CC: Mario Limonciello <[email protected]>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
>
> This supersedes my previous attempt just trying to replace
> iommu_present() at [1], further to the original discussion at [2].
>
> [1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
> [2] https://lore.kernel.org/linux-iommu/[email protected]/T/
>
> drivers/thunderbolt/domain.c | 12 +++---------
> drivers/thunderbolt/nhi.c | 35 +++++++++++++++++++++++++++++++++++
> include/linux/thunderbolt.h | 2 ++
> 3 files changed, 40 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
> index 7018d959f775..d5c825e84ac8 100644
> --- a/drivers/thunderbolt/domain.c
> +++ b/drivers/thunderbolt/domain.c
> @@ -7,9 +7,7 @@
> */
>
> #include <linux/device.h>
> -#include <linux/dmar.h>
> #include <linux/idr.h>
> -#include <linux/iommu.h>
> #include <linux/module.h>
> #include <linux/pm_runtime.h>
> #include <linux/slab.h>
> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
> struct device_attribute *attr,
> char *buf)
> {
> - /*
> - * Kernel DMA protection is a feature where Thunderbolt security is
> - * handled natively using IOMMU. It is enabled when IOMMU is
> - * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
> - */
> - return sprintf(buf, "%d\n",
> - iommu_present(&pci_bus_type) && dmar_platform_optin());
> + struct tb *tb = container_of(dev, struct tb, dev);
> +
> + return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
> }
> static DEVICE_ATTR_RO(iommu_dma_protection);
>
> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
> index c73da0532be4..e12c2e266741 100644
> --- a/drivers/thunderbolt/nhi.c
> +++ b/drivers/thunderbolt/nhi.c
> @@ -14,6 +14,7 @@
> #include <linux/errno.h>
> #include <linux/pci.h>
> #include <linux/interrupt.h>
> +#include <linux/iommu.h>
> #include <linux/module.h>
> #include <linux/delay.h>
> #include <linux/property.h>
> @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
> nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
> }
>
> +static void nhi_check_iommu(struct tb_nhi *nhi)
> +{
> + struct pci_dev *pdev;
> + bool port_ok = false;
So for here somewhere we should call dmar_platform_optin() first. I
think this is something we want to move inside the IOMMU API (alongside
with the below checks).
> +
> + /*
> + * Check for sibling devices that look like they should be our
> + * tunnelled ports. We can reasonably assume that if an IOMMU is
> + * managing the bridge it will manage any future devices beyond it
> + * too. If firmware has described a port as external-facing as
> + * expected then we can trust the IOMMU layer to enforce isolation;
> + * otherwise even if translation is enabled for existing devices it
> + * may potentially be overridden for a future tunnelled endpoint.
> + */
> + for_each_pci_bridge(pdev, nhi->pdev->bus) {
> + if (!pci_is_pcie(pdev) ||
> + !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
> + pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
> + continue;
> +
> + if (!device_iommu_mapped(&pdev->dev))
> + return;
> +
> + if (!pdev->untrusted) {
> + dev_info(&nhi->pdev->dev,
> + "Assuming unreliable Kernel DMA protection\n");
> + return;
> + }
> + port_ok = true;
> + }
> + nhi->iommu_dma_protection = port_ok;
> +}
> +
> static int nhi_init_msi(struct tb_nhi *nhi)
> {
> struct pci_dev *pdev = nhi->pdev;
> @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> return -ENOMEM;
>
> nhi_check_quirks(nhi);
> + nhi_check_iommu(nhi);
>
> res = nhi_init_msi(nhi);
> if (res) {
> diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
> index 124e13cb1469..7a8ad984e651 100644
> --- a/include/linux/thunderbolt.h
> +++ b/include/linux/thunderbolt.h
> @@ -465,6 +465,7 @@ static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
> * @msix_ida: Used to allocate MSI-X vectors for rings
> * @going_away: The host controller device is about to disappear so when
> * this flag is set, avoid touching the hardware anymore.
> + * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
> * @interrupt_work: Work scheduled to handle ring interrupt when no
> * MSI-X is used.
> * @hop_count: Number of rings (end point hops) supported by NHI.
> @@ -479,6 +480,7 @@ struct tb_nhi {
> struct tb_ring **rx_rings;
> struct ida msix_ida;
> bool going_away;
> + bool iommu_dma_protection;
> struct work_struct interrupt_work;
> u32 hop_count;
> unsigned long quirks;
> --
> 2.28.0.dirty
Hi Robin,
On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> > This adds quite a lot code and complexity, and honestly I would like to
> > keep it as simple as possible (and this is not enough because we need to
> > make sure the DMAR bit is there so that none of the possible connected
> > devices were able to overwrite our memory already).
>
> Shall we forget the standalone sibling check and just make the
> pdev->untrusted check directly in tb_acpi_add_link() then?
I think we should leave tb_acpi_add_link() untouched if possible ;-)
This is because it is used to add the device links from firmware
description that we need for proper power management of the tunneled
devices. It has little to do with the identification of the external
facing DMA-capable PCIe ports.
Furthermore these links only exists in USB4 software connection manager
systems so we do not have those in the existing Thunderbolt 3/4 systems
that use firmware based connection manager (pretty much all out there).
> On reflection I guess the DMAR bit makes iommu_dma_protection
> functionally dependent on ACPI already, so we don't actually lose
> anything (and anyone can come back and revisit firmware-agnostic
> methods later if a need appears).
I agree.
On 2022-03-18 13:25, [email protected] wrote:
> Hi Robin,
>
> On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
>>> This adds quite a lot code and complexity, and honestly I would like to
>>> keep it as simple as possible (and this is not enough because we need to
>>> make sure the DMAR bit is there so that none of the possible connected
>>> devices were able to overwrite our memory already).
>>
>> Shall we forget the standalone sibling check and just make the
>> pdev->untrusted check directly in tb_acpi_add_link() then?
>
> I think we should leave tb_acpi_add_link() untouched if possible ;-)
> This is because it is used to add the device links from firmware
> description that we need for proper power management of the tunneled
> devices. It has little to do with the identification of the external
> facing DMA-capable PCIe ports.
>
> Furthermore these links only exists in USB4 software connection manager
> systems so we do not have those in the existing Thunderbolt 3/4 systems
> that use firmware based connection manager (pretty much all out there).
>
>> On reflection I guess the DMAR bit makes iommu_dma_protection
>> functionally dependent on ACPI already, so we don't actually lose
>> anything (and anyone can come back and revisit firmware-agnostic
>> methods later if a need appears).
>
> I agree.
OK, so do we have any realistic options for identifying the correct PCI
devices, if USB4 PCIe adapters might be anywhere relative to their
associated NHI? Short of maintaining a list of known IDs, the only
thought I have left is that if we walk the whole PCI segment looking
specifically for hotplug-capable Gen1 ports, any system modern enough to
have Thunderbolt is *probably* not going to have any real PCIe Gen1
hotplug slots, so maybe false negatives might be tolerable, but it still
feels like a bit of a sketchy heuristic.
I suppose we could just look to see if any device anywhere is marked as
external-facing, and hope that if firmware's done that much then it's
done everything right. That's still at least slightly better than what
we have today, but AFAICS still carries significant risk of a false
positive for an add-in card that firmware didn't recognise.
I'm satisfied that we've come round to the right conclusion on the DMAR
opt-in - I'm in the middle or writing up patches for that now - but even
Microsoft's spec gives that as a separate requirement from the flagging
of external ports, with both being necessary for Kernel DMA Protection.
Robin.
[Public]
> -----Original Message-----
> From: Robin Murphy <[email protected]>
> Sent: Friday, March 18, 2022 09:08
> To: [email protected]
> Cc: Limonciello, Mario <[email protected]>;
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH] thunderbolt: Make iommu_dma_protection more
> accurate
>
> On 2022-03-18 13:25, [email protected] wrote:
> > Hi Robin,
> >
> > On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> >>> This adds quite a lot code and complexity, and honestly I would like to
> >>> keep it as simple as possible (and this is not enough because we need to
> >>> make sure the DMAR bit is there so that none of the possible connected
> >>> devices were able to overwrite our memory already).
> >>
> >> Shall we forget the standalone sibling check and just make the
> >> pdev->untrusted check directly in tb_acpi_add_link() then?
> >
> > I think we should leave tb_acpi_add_link() untouched if possible ;-)
> > This is because it is used to add the device links from firmware
> > description that we need for proper power management of the tunneled
> > devices. It has little to do with the identification of the external
> > facing DMA-capable PCIe ports.
> >
> > Furthermore these links only exists in USB4 software connection manager
> > systems so we do not have those in the existing Thunderbolt 3/4 systems
> > that use firmware based connection manager (pretty much all out there).
> >
> >> On reflection I guess the DMAR bit makes iommu_dma_protection
> >> functionally dependent on ACPI already, so we don't actually lose
> >> anything (and anyone can come back and revisit firmware-agnostic
> >> methods later if a need appears).
> >
> > I agree.
>
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only
> thought I have left is that if we walk the whole PCI segment looking
> specifically for hotplug-capable Gen1 ports, any system modern enough to
> have Thunderbolt is *probably* not going to have any real PCIe Gen1
> hotplug slots, so maybe false negatives might be tolerable, but it still
> feels like a bit of a sketchy heuristic.
>
> I suppose we could just look to see if any device anywhere is marked as
> external-facing, and hope that if firmware's done that much then it's
> done everything right. That's still at least slightly better than what
> we have today, but AFAICS still carries significant risk of a false
> positive for an add-in card that firmware didn't recognise.
>
> I'm satisfied that we've come round to the right conclusion on the DMAR
> opt-in - I'm in the middle or writing up patches for that now - but even
> Microsoft's spec gives that as a separate requirement from the flagging
> of external ports, with both being necessary for Kernel DMA Protection.
>
> Robin.
The thunderbolt driver already has a good idea whether it's using software
CM or firmware CM. How about if we use that to make a decision?
If running firmware CM presumably that is a fixed quantity of machines that will
dwindle over time as OEMs release new HW with SW CM designs. So maybe
leave things "as is" for those - opt in bit is sufficient for this check.
And then if running software CM then make it mandatory that any created links
are set untrusted AND some pre-boot bits are set to get iommu dma protection
set. We effectively end up with the same requirements as Microsoft has in
their spec for new hardware then, and don't break any old hardware that
doesn't have the links made and relies on firmware for the CM.
On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only thought I
> have left is that if we walk the whole PCI segment looking specifically for
> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> false negatives might be tolerable, but it still feels like a bit of a
> sketchy heuristic.
The Thunderbolt Device ROM contains the PCI slot number, so you can
correlate the Thunderbolt switch ports with PCIe downstream ports
and know exactly where PCIe tunnels are terminated.
Code is here:
* thunderbolt: Obtain PCI slot number from DROM
https://github.com/l1k/linux/commit/756f7148bc10
* thunderbolt: Move upstream_port to struct tb
https://github.com/l1k/linux/commit/58f16e7dd431
* thunderbolt: Correlate PCI devices with Thunderbolt ports
https://github.com/l1k/linux/commit/f53ea40a7487
I implemented that in 2018, so it won't apply cleanly to current
mainline. But I kept forward-porting it on my private branch and
could push that to GitHub if anyone is interested.
I don't know if this will work out-of-the-box for SoC-integrated
Thunderbolt controllers. It was developed with the discrete
controllers in mind, which was the only thing available back then.
Thanks,
Lukas
On 2022-03-18 06:44, Mika Westerberg wrote:
> Hi Robin,
>
> Thanks for working on this!
>
> On Thu, Mar 17, 2022 at 04:17:07PM +0000, Robin Murphy wrote:
>> Between me trying to get rid of iommu_present() and Mario wanting to
>> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
>> that the iommu_dma_protection attribute is being far too optimistic.
>> Even if an IOMMU might be present for some PCI segment in the system,
>> that doesn't necessarily mean it provides translation for the device(s)
>> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
>> is tell us that memory was protected before the kernel was loaded, and
>> prevent the user from disabling the intel-iommu driver entirely. What
>> actually matters is whether we trust individual devices, based on the
>> "external facing" property that we expect firmware to describe for
>> Thunderbolt ports.
>
> We still want to know that DMAR_PLATFORM_OPT_IN is set by the firmware
> because that tells us that none of the devices connected before OS got
> control had the ability to perform DMA outside of the RMRR regions. If
> they did then all this is pointless because they could have modified the
> system memory as they wished.
Ah, right, it does still matter in terms of whether we can trust our
*own* integrity. I was thinking that it's OK since by this point the
IOMMU driver should have reset the hardware and allocated new
configuration tables etc. such that any previous modifications wouldn't
matter, but of course the theoretical worst case is that the malicious
device manages to corrupt the kernel code in memory just before that
point to subvert the IOMMU driver itself. Got it, thanks again!
So yes, I think we *do* want to abstract that through the IOMMU API (and
I'll follow up with our ACPI folks internally to check whether anyone's
thought about this for Arm-based systems yet). Gosh, this is becoming
quite the adventure, but it's all for good in the end :)
Robin.
>> Avoid false positives by looking as close as possible to the same PCI
>> topology that the IOMMU layer will consider once a Thunderbolt endpoint
>> appears. Crucially, we can't assume that IOMMU translation being enabled
>> for any reason is sufficient on its own; full (expensive) DMA protection
>> will still only be imposed on untrusted devices.
>>
>> CC: Mario Limonciello <[email protected]>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>>
>> This supersedes my previous attempt just trying to replace
>> iommu_present() at [1], further to the original discussion at [2].
>>
>> [1] https://lore.kernel.org/linux-iommu/BL1PR12MB515799C0BE396377DBBEF055E2119@BL1PR12MB5157.namprd12.prod.outlook.com/T/
>> [2] https://lore.kernel.org/linux-iommu/[email protected]/T/
>>
>> drivers/thunderbolt/domain.c | 12 +++---------
>> drivers/thunderbolt/nhi.c | 35 +++++++++++++++++++++++++++++++++++
>> include/linux/thunderbolt.h | 2 ++
>> 3 files changed, 40 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
>> index 7018d959f775..d5c825e84ac8 100644
>> --- a/drivers/thunderbolt/domain.c
>> +++ b/drivers/thunderbolt/domain.c
>> @@ -7,9 +7,7 @@
>> */
>>
>> #include <linux/device.h>
>> -#include <linux/dmar.h>
>> #include <linux/idr.h>
>> -#include <linux/iommu.h>
>> #include <linux/module.h>
>> #include <linux/pm_runtime.h>
>> #include <linux/slab.h>
>> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
>> struct device_attribute *attr,
>> char *buf)
>> {
>> - /*
>> - * Kernel DMA protection is a feature where Thunderbolt security is
>> - * handled natively using IOMMU. It is enabled when IOMMU is
>> - * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
>> - */
>> - return sprintf(buf, "%d\n",
>> - iommu_present(&pci_bus_type) && dmar_platform_optin());
>> + struct tb *tb = container_of(dev, struct tb, dev);
>> +
>> + return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);
>> }
>> static DEVICE_ATTR_RO(iommu_dma_protection);
>>
>> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
>> index c73da0532be4..e12c2e266741 100644
>> --- a/drivers/thunderbolt/nhi.c
>> +++ b/drivers/thunderbolt/nhi.c
>> @@ -14,6 +14,7 @@
>> #include <linux/errno.h>
>> #include <linux/pci.h>
>> #include <linux/interrupt.h>
>> +#include <linux/iommu.h>
>> #include <linux/module.h>
>> #include <linux/delay.h>
>> #include <linux/property.h>
>> @@ -1102,6 +1103,39 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>> nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>> }
>>
>> +static void nhi_check_iommu(struct tb_nhi *nhi)
>> +{
>> + struct pci_dev *pdev;
>> + bool port_ok = false;
>
>
> So for here somewhere we should call dmar_platform_optin() first. I
> think this is something we want to move inside the IOMMU API (alongside
> with the below checks).
>
>> +
>> + /*
>> + * Check for sibling devices that look like they should be our
>> + * tunnelled ports. We can reasonably assume that if an IOMMU is
>> + * managing the bridge it will manage any future devices beyond it
>> + * too. If firmware has described a port as external-facing as
>> + * expected then we can trust the IOMMU layer to enforce isolation;
>> + * otherwise even if translation is enabled for existing devices it
>> + * may potentially be overridden for a future tunnelled endpoint.
>> + */
>> + for_each_pci_bridge(pdev, nhi->pdev->bus) {
>> + if (!pci_is_pcie(pdev) ||
>> + !(pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT ||
>> + pci_pcie_type(pdev) == PCI_EXP_TYPE_DOWNSTREAM))
>> + continue;
>> +
>> + if (!device_iommu_mapped(&pdev->dev))
>> + return;
>> +
>> + if (!pdev->untrusted) {
>> + dev_info(&nhi->pdev->dev,
>> + "Assuming unreliable Kernel DMA protection\n");
>> + return;
>> + }
>> + port_ok = true;
>> + }
>> + nhi->iommu_dma_protection = port_ok;
>> +}
>> +
>> static int nhi_init_msi(struct tb_nhi *nhi)
>> {
>> struct pci_dev *pdev = nhi->pdev;
>> @@ -1219,6 +1253,7 @@ static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>> return -ENOMEM;
>>
>> nhi_check_quirks(nhi);
>> + nhi_check_iommu(nhi);
>>
>> res = nhi_init_msi(nhi);
>> if (res) {
>> diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
>> index 124e13cb1469..7a8ad984e651 100644
>> --- a/include/linux/thunderbolt.h
>> +++ b/include/linux/thunderbolt.h
>> @@ -465,6 +465,7 @@ static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
>> * @msix_ida: Used to allocate MSI-X vectors for rings
>> * @going_away: The host controller device is about to disappear so when
>> * this flag is set, avoid touching the hardware anymore.
>> + * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
>> * @interrupt_work: Work scheduled to handle ring interrupt when no
>> * MSI-X is used.
>> * @hop_count: Number of rings (end point hops) supported by NHI.
>> @@ -479,6 +480,7 @@ struct tb_nhi {
>> struct tb_ring **rx_rings;
>> struct ida msix_ida;
>> bool going_away;
>> + bool iommu_dma_protection;
>> struct work_struct interrupt_work;
>> u32 hop_count;
>> unsigned long quirks;
>> --
>> 2.28.0.dirty
On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> On 2022-03-18 13:25, [email protected] wrote:
> > Hi Robin,
> >
> > On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
> > > > This adds quite a lot code and complexity, and honestly I would like to
> > > > keep it as simple as possible (and this is not enough because we need to
> > > > make sure the DMAR bit is there so that none of the possible connected
> > > > devices were able to overwrite our memory already).
> > >
> > > Shall we forget the standalone sibling check and just make the
> > > pdev->untrusted check directly in tb_acpi_add_link() then?
> >
> > I think we should leave tb_acpi_add_link() untouched if possible ;-)
> > This is because it is used to add the device links from firmware
> > description that we need for proper power management of the tunneled
> > devices. It has little to do with the identification of the external
> > facing DMA-capable PCIe ports.
> >
> > Furthermore these links only exists in USB4 software connection manager
> > systems so we do not have those in the existing Thunderbolt 3/4 systems
> > that use firmware based connection manager (pretty much all out there).
> >
> > > On reflection I guess the DMAR bit makes iommu_dma_protection
> > > functionally dependent on ACPI already, so we don't actually lose
> > > anything (and anyone can come back and revisit firmware-agnostic
> > > methods later if a need appears).
> >
> > I agree.
>
> OK, so do we have any realistic options for identifying the correct PCI
> devices, if USB4 PCIe adapters might be anywhere relative to their
> associated NHI? Short of maintaining a list of known IDs, the only thought I
> have left is that if we walk the whole PCI segment looking specifically for
> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> false negatives might be tolerable, but it still feels like a bit of a
> sketchy heuristic.
Indeed.
> I suppose we could just look to see if any device anywhere is marked as
> external-facing, and hope that if firmware's done that much then it's done
> everything right. That's still at least slightly better than what we have
> today, but AFAICS still carries significant risk of a false positive for an
> add-in card that firmware didn't recognise.
The port in this case, that is marked as external facing, is the PCIe
root port that the add-in-card is connected to and that is known for the
firmware in advance.
> I'm satisfied that we've come round to the right conclusion on the DMAR
> opt-in - I'm in the middle or writing up patches for that now - but even
> Microsoft's spec gives that as a separate requirement from the flagging of
> external ports, with both being necessary for Kernel DMA Protection.
Is the problem that we are here trying to solve the fact that user can
disable the IOMMU protection from the command line? Or the fact that the
firmware might not declare all the ports properly so we may end up in a
situation that some of the ports do not get the full IOMMU protection.
These are Microsoft requirements for the OEMs in order to pass their
firmware test suite so here I would not expect to have issues. Otherwise
they simply cannot ship the thing with Windows installed.
IMHO we should just trust the firmare provided information here
(otherwise we are screwed anyway as there is no way to tell if the
devices connected prior the OS can still do DMA), and use the external
facing port indicator to idenfity the ports that need DMA protection.
Hi Robin,
On Fri, Mar 18, 2022 at 03:15:19PM +0000, Robin Murphy wrote:
> > IMHO we should just trust the firmare provided information here
> > (otherwise we are screwed anyway as there is no way to tell if the
> > devices connected prior the OS can still do DMA), and use the external
> > facing port indicator to idenfity the ports that need DMA protection.
>
> Indeed that's exactly what I want to do, but it begs the question of how we
> *find* the firmware-provided information in the first place!
Oh, right :) Its the combination of ACPI _DSD "ExternalFacingPort"
(which we already set, dev->external_facing, dev->untrusted for the
devices behind these ports IIRC) and the DMAR opt-in bit. All these are
already read by the kernel.
> I seem to have already started writing the dumb version that will walk the
> whole PCI segment and assume the presence of any external-facing port
> implies that we're good. Let me know if I should stop ;)
That sounds good to me, so don't stop just yet ;-)
On 2022-03-18 14:47, [email protected] wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
>> On 2022-03-18 13:25, [email protected] wrote:
>>> Hi Robin,
>>>
>>> On Fri, Mar 18, 2022 at 12:01:42PM +0000, Robin Murphy wrote:
>>>>> This adds quite a lot code and complexity, and honestly I would like to
>>>>> keep it as simple as possible (and this is not enough because we need to
>>>>> make sure the DMAR bit is there so that none of the possible connected
>>>>> devices were able to overwrite our memory already).
>>>>
>>>> Shall we forget the standalone sibling check and just make the
>>>> pdev->untrusted check directly in tb_acpi_add_link() then?
>>>
>>> I think we should leave tb_acpi_add_link() untouched if possible ;-)
>>> This is because it is used to add the device links from firmware
>>> description that we need for proper power management of the tunneled
>>> devices. It has little to do with the identification of the external
>>> facing DMA-capable PCIe ports.
>>>
>>> Furthermore these links only exists in USB4 software connection manager
>>> systems so we do not have those in the existing Thunderbolt 3/4 systems
>>> that use firmware based connection manager (pretty much all out there).
>>>
>>>> On reflection I guess the DMAR bit makes iommu_dma_protection
>>>> functionally dependent on ACPI already, so we don't actually lose
>>>> anything (and anyone can come back and revisit firmware-agnostic
>>>> methods later if a need appears).
>>>
>>> I agree.
>>
>> OK, so do we have any realistic options for identifying the correct PCI
>> devices, if USB4 PCIe adapters might be anywhere relative to their
>> associated NHI? Short of maintaining a list of known IDs, the only thought I
>> have left is that if we walk the whole PCI segment looking specifically for
>> hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
>> *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
>> false negatives might be tolerable, but it still feels like a bit of a
>> sketchy heuristic.
>
> Indeed.
>
>> I suppose we could just look to see if any device anywhere is marked as
>> external-facing, and hope that if firmware's done that much then it's done
>> everything right. That's still at least slightly better than what we have
>> today, but AFAICS still carries significant risk of a false positive for an
>> add-in card that firmware didn't recognise.
>
> The port in this case, that is marked as external facing, is the PCIe
> root port that the add-in-card is connected to and that is known for the
> firmware in advance.
>
>> I'm satisfied that we've come round to the right conclusion on the DMAR
>> opt-in - I'm in the middle or writing up patches for that now - but even
>> Microsoft's spec gives that as a separate requirement from the flagging of
>> external ports, with both being necessary for Kernel DMA Protection.
>
> Is the problem that we are here trying to solve the fact that user can
> disable the IOMMU protection from the command line? Or the fact that the
> firmware might not declare all the ports properly so we may end up in a
> situation that some of the ports do not get the full IOMMU protection.
It's about knowing whether or not firmware has declared the ports at
all. If it hasn't then the system is vulnerable to *some* degree of DMA
attacks regardless of anything else (the exact degree depending on
kernel config and user overrides). Complete mitigation is simply too
expensive to apply by default to every device the IOMMU layer is unsure
about. The Thunderbolt driver cannot be confident that protection is in
place unless it can somehow know that the IOMMU layer has seen that
untrusted property on the relevant ports.
> These are Microsoft requirements for the OEMs in order to pass their
> firmware test suite so here I would not expect to have issues. Otherwise
> they simply cannot ship the thing with Windows installed.
>
> IMHO we should just trust the firmare provided information here
> (otherwise we are screwed anyway as there is no way to tell if the
> devices connected prior the OS can still do DMA), and use the external
> facing port indicator to idenfity the ports that need DMA protection.
Indeed that's exactly what I want to do, but it begs the question of how
we *find* the firmware-provided information in the first place!
I seem to have already started writing the dumb version that will walk
the whole PCI segment and assume the presence of any external-facing
port implies that we're good. Let me know if I should stop ;)
Cheers,
Robin.
Hi Lukas,
On Fri, Mar 18, 2022 at 03:51:21PM +0100, Lukas Wunner wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> > OK, so do we have any realistic options for identifying the correct PCI
> > devices, if USB4 PCIe adapters might be anywhere relative to their
> > associated NHI? Short of maintaining a list of known IDs, the only thought I
> > have left is that if we walk the whole PCI segment looking specifically for
> > hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> > *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> > false negatives might be tolerable, but it still feels like a bit of a
> > sketchy heuristic.
>
> The Thunderbolt Device ROM contains the PCI slot number, so you can
> correlate the Thunderbolt switch ports with PCIe downstream ports
> and know exactly where PCIe tunnels are terminated.
>
> Code is here:
> * thunderbolt: Obtain PCI slot number from DROM
> https://github.com/l1k/linux/commit/756f7148bc10
> * thunderbolt: Move upstream_port to struct tb
> https://github.com/l1k/linux/commit/58f16e7dd431
> * thunderbolt: Correlate PCI devices with Thunderbolt ports
> https://github.com/l1k/linux/commit/f53ea40a7487
>
> I implemented that in 2018, so it won't apply cleanly to current
> mainline. But I kept forward-porting it on my private branch and
> could push that to GitHub if anyone is interested.
>
> I don't know if this will work out-of-the-box for SoC-integrated
> Thunderbolt controllers. It was developed with the discrete
> controllers in mind, which was the only thing available back then.
That DROM entry is completely optional and so is the whole DROM for the
host routers (this is the root of the USB4/TBT topology) so
unfortunately we cannot use it here.
On Fri, Mar 18, 2022 at 03:51:21PM +0100, Lukas Wunner wrote:
> On Fri, Mar 18, 2022 at 02:08:16PM +0000, Robin Murphy wrote:
> > OK, so do we have any realistic options for identifying the correct PCI
> > devices, if USB4 PCIe adapters might be anywhere relative to their
> > associated NHI? Short of maintaining a list of known IDs, the only thought I
> > have left is that if we walk the whole PCI segment looking specifically for
> > hotplug-capable Gen1 ports, any system modern enough to have Thunderbolt is
> > *probably* not going to have any real PCIe Gen1 hotplug slots, so maybe
> > false negatives might be tolerable, but it still feels like a bit of a
> > sketchy heuristic.
>
> The Thunderbolt Device ROM contains the PCI slot number, so you can
> correlate the Thunderbolt switch ports with PCIe downstream ports
> and know exactly where PCIe tunnels are terminated.
[...]
> I implemented that in 2018, so it won't apply cleanly to current
> mainline. But I kept forward-porting it on my private branch and
> could push that to GitHub if anyone is interested.
FWIW, here's the most recent forward-port I've done:
https://github.com/l1k/linux/commits/thunderbolt_correlate_5.13