2020-06-30 04:50:25

by Rajat Jain

Subject: [PATCH v2 0/7] Tighten PCI security, expose dev location in sysfs

This is a set of loosely related patches, most of which emerged out of
discussion in the following threads. In a nutshell, the goal was to allow
an administrator to specify which drivers they want to allow on external
ports, and a strategy was chalked out:
https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
https://lore.kernel.org/linux-pci/[email protected]/
https://lore.kernel.org/linux-pci/[email protected]/

* The first 3 patches tighten the PCI security using ACS, and take care
of a border case.
* The 4th patch takes care of a PCI bug.
* The 5th and 6th patches expose a device's location in sysfs to allow
the admin to make decisions based on it.
* The 7th patch ensures that external devices don't bind to drivers
during boot.

Rajat Jain (7):
PCI: Keep the ACS capability offset in device
PCI: Set "untrusted" flag for truly external devices only
PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices
PCI: Add device even if driver attach failed
driver core: Add device location to "struct device" and expose it in
sysfs
PCI: Move pci_dev->untrusted logic to use device location instead
PCI: Add parameter to disable attaching external devices

drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
drivers/iommu/intel/iommu.c | 31 ++++++++++++++++++---------
drivers/pci/ats.c | 2 +-
drivers/pci/bus.c | 13 ++++++------
drivers/pci/of.c | 2 +-
drivers/pci/p2pdma.c | 2 +-
drivers/pci/pci-acpi.c | 13 ++++++------
drivers/pci/pci-driver.c | 1 +
drivers/pci/pci.c | 34 ++++++++++++++++++++++++++----
drivers/pci/pci.h | 3 ++-
drivers/pci/probe.c | 20 +++++++++++-------
drivers/pci/quirks.c | 19 +++++++++++++----
include/linux/device.h | 42 +++++++++++++++++++++++++++++++++++++
include/linux/device/bus.h | 8 +++++++
include/linux/pci.h | 13 ++++++------
15 files changed, 191 insertions(+), 47 deletions(-)

--
2.27.0.212.ge8ba1cc988-goog


2020-06-30 04:50:44

by Rajat Jain

Subject: [PATCH v2 6/7] PCI: Move pci_dev->untrusted logic to use device location instead

The firmware has been providing the "ExternalFacing" attribute on PCI root
ports, to allow the kernel to mark devices behind them as external. Note
that the firmware provides an immutable, read-only property, i.e. the
location of the device.

The use of (external) device location as a hint for (dis)trust is a
decision that IOMMU drivers have taken, so we should call it out
explicitly.

This patch removes pci_dev->untrusted and changes its users to use the
device location provided by the device core instead. This location is
populated by PCI using the same "ExternalFacing" firmware info. Any
device not behind an "ExternalFacing" bridge is marked internal, and
the ones behind such bridges are marked external.
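
The marking rule described above can be modeled outside the kernel. What
follows is a minimal userspace sketch, not the kernel code: struct toy_dev,
toy_set_site() and the field names are hypothetical stand-ins for
struct pci_dev, set_pcie_dev_site() and pci_upstream_bridge(), and the enum
simply mirrors the one proposed in patch 5/7.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum proposed in patch 5/7. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Hypothetical stand-in for struct pci_dev; not a kernel type. */
struct toy_dev {
	struct toy_dev *parent;		/* upstream bridge, NULL at the root */
	bool external_facing;		/* set from firmware on the port itself */
	enum device_site site;
};

/*
 * Same decision as set_pcie_dev_site() in this patch: internal by default,
 * external if the upstream bridge is external-facing or already external.
 * Must be called parent-first, which is the order enumeration uses.
 */
static void toy_set_site(struct toy_dev *dev)
{
	const struct toy_dev *parent = dev->parent;

	dev->site = SITE_INTERNAL;
	if (parent && (parent->external_facing ||
		       parent->site == SITE_EXTERNAL))
		dev->site = SITE_EXTERNAL;
}
```

Note that the external-facing port itself stays internal; only what is
enumerated below it becomes external, and the property is inherited all the
way down the hierarchy.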

Signed-off-by: Rajat Jain <[email protected]>
---
v2: (Initial version)

drivers/iommu/intel/iommu.c | 31 +++++++++++++++++++++----------
drivers/pci/ats.c | 2 +-
drivers/pci/pci-driver.c | 1 +
drivers/pci/pci.c | 2 +-
drivers/pci/probe.c | 18 ++++++++++++------
drivers/pci/quirks.c | 2 +-
include/linux/pci.h | 10 +---------
7 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1ccb224f82496..ca66a196f5e97 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -168,6 +168,22 @@ static inline unsigned long virt_to_dma_pfn(void *p)
return page_to_dma_pfn(virt_to_page(p));
}

+static inline bool untrusted_dev(struct device *dev)
+{
+ /*
+ * Treat all external PCI devices as untrusted devices. These are the
+ * devices sitting behind external-facing bridges, as marked by
+ * the firmware. The untrusted devices are the ones that can potentially
+ * execute DMA attacks and similar. They are typically connected through
+ * external thunderbolt ports. When an IOMMU is enabled they should be
+ * getting full mappings to ensure they cannot access arbitrary memory.
+ */
+ if (dev_is_pci(dev) && dev_is_external(dev))
+ return true;
+
+ return false;
+}
+
/* global iommu list, set NULL for ignored DMAR units */
static struct intel_iommu **g_iommus;

@@ -383,8 +399,7 @@ struct device_domain_info *get_domain_info(struct device *dev)
DEFINE_SPINLOCK(device_domain_lock);
static LIST_HEAD(device_domain_list);

-#define device_needs_bounce(d) (!intel_no_bounce && dev_is_pci(d) && \
- to_pci_dev(d)->untrusted)
+#define device_needs_bounce(d) (!intel_no_bounce && untrusted_dev(d))

/*
* Iterate over elements in device_domain_list and call the specified
@@ -2830,7 +2845,7 @@ static int device_def_domain_type(struct device *dev)
* Prevent any device marked as untrusted from getting
* placed into the statically identity mapping domain.
*/
- if (pdev->untrusted)
+ if (untrusted_dev(dev))
return IOMMU_DOMAIN_DMA;

if ((iommu_identity_mapping & IDENTMAP_AZALIA) && IS_AZALIA(pdev))
@@ -3464,7 +3479,6 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
unsigned long iova_pfn;
struct intel_iommu *iommu;
struct page *freelist;
- struct pci_dev *pdev = NULL;

domain = find_domain(dev);
BUG_ON(!domain);
@@ -3477,11 +3491,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
start_pfn = mm_to_dma_pfn(iova_pfn);
last_pfn = start_pfn + nrpages - 1;

- if (dev_is_pci(dev))
- pdev = to_pci_dev(dev);
-
freelist = domain_unmap(domain, start_pfn, last_pfn);
- if (intel_iommu_strict || (pdev && pdev->untrusted) ||
+ if (intel_iommu_strict || untrusted_dev(dev) ||
!has_iova_flush_queue(&domain->iovad)) {
iommu_flush_iotlb_psi(iommu, domain, start_pfn,
nrpages, !freelist, 0);
@@ -4743,7 +4754,7 @@ static inline bool has_untrusted_dev(void)
struct pci_dev *pdev = NULL;

for_each_pci_dev(pdev)
- if (pdev->untrusted || pdev->external_facing)
+ if (pdev->external_facing || untrusted_dev(&pdev->dev))
return true;

return false;
@@ -6036,7 +6047,7 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
*/
static bool risky_device(struct pci_dev *pdev)
{
- if (pdev->untrusted) {
+ if (untrusted_dev(&pdev->dev)) {
pci_info(pdev,
"Skipping IOMMU quirk for dev [%04X:%04X] on untrusted PCI link\n",
pdev->vendor, pdev->device);
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f672..ebd370f4d5b06 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -42,7 +42,7 @@ bool pci_ats_supported(struct pci_dev *dev)
if (!dev->ats_cap)
return false;

- return (dev->untrusted == 0);
+ return (!dev_is_external(&dev->dev));
}
EXPORT_SYMBOL_GPL(pci_ats_supported);

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index da6510af12214..9608053a8a62c 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1630,6 +1630,7 @@ struct bus_type pci_bus_type = {
.pm = PCI_PM_OPS_PTR,
.num_vf = pci_bus_num_vf,
.dma_configure = pci_dma_configure,
+ .supports_site = true,
};
EXPORT_SYMBOL(pci_bus_type);

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 79853b52658a2..35f25ac39167b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3330,7 +3330,7 @@ static void pci_std_enable_acs(struct pci_dev *dev)
/* Upstream Forwarding */
ctrl |= (cap & PCI_ACS_UF);

- if (dev->external_facing || dev->untrusted)
+ if (dev->external_facing || dev_is_external(&dev->dev))
/* Translation Blocking */
ctrl |= (cap & PCI_ACS_TB);

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8c40c00413e74..1609329cc5b4e 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1543,17 +1543,23 @@ static void set_pcie_thunderbolt(struct pci_dev *dev)
}
}

-static void set_pcie_untrusted(struct pci_dev *dev)
+static void set_pcie_dev_site(struct pci_dev *dev)
{
struct pci_dev *parent;

/*
- * If the upstream bridge is untrusted we treat this device
- * untrusted as well.
+ * All devices are considered internal by default, unless behind an
+ * external-facing bridge, as marked by the firmware.
+ */
+ dev_set_site(&dev->dev, SITE_INTERNAL);
+
+ /*
+ * If the upstream bridge is external or external-facing, this device
+ * is also external.
*/
parent = pci_upstream_bridge(dev);
- if (parent && (parent->untrusted || parent->external_facing))
- dev->untrusted = true;
+ if (parent && (parent->external_facing || dev_is_external(&parent->dev)))
+ dev_set_site(&dev->dev, SITE_EXTERNAL);
}

/**
@@ -1814,7 +1820,7 @@ int pci_setup_device(struct pci_dev *dev)
/* Need to have dev->cfg_size ready */
set_pcie_thunderbolt(dev);

- set_pcie_untrusted(dev);
+ set_pcie_dev_site(dev);

/* "Unknown power state" */
dev->current_state = PCI_UNKNOWN;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6294adeac4049..65d0b8745c915 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4980,7 +4980,7 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
ctrl |= (cap & PCI_ACS_CR);
ctrl |= (cap & PCI_ACS_UF);

- if (dev->external_facing || dev->untrusted)
+ if (dev->external_facing || dev_is_external(&dev->dev))
/* Translation Blocking */
ctrl |= (cap & PCI_ACS_TB);

diff --git a/include/linux/pci.h b/include/linux/pci.h
index fe1bc603fda40..8bb5065e5aed2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -424,20 +424,12 @@ struct pci_dev {
unsigned int is_hotplug_bridge:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
- /*
- * Devices marked being untrusted are the ones that can potentially
- * execute DMA attacks and similar. They are typically connected
- * through external ports such as Thunderbolt but not limited to
- * that. When an IOMMU is enabled they should be getting full
- * mappings to make sure they cannot access arbitrary memory.
- */
- unsigned int untrusted:1;
/*
* Devices are marked as external-facing using info from platform
* (ACPI / devicetree). An external-facing device is still an internal
* trusted device, but it faces external untrusted devices. Thus any
* devices enumerated downstream an external-facing device is marked
- * as untrusted.
+ * as external device.
*/
unsigned int external_facing:1;
unsigned int broken_intx_masking:1; /* INTx masking can't be used */
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 04:50:49

by Rajat Jain

Subject: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

Add a new (optional) field to denote the physical location of a device
in the system, and expose it in sysfs. This was discussed here:
https://lore.kernel.org/linux-acpi/[email protected]/

(The primary choice for the attribute name, "location", is already
exposed as an ABI elsewhere, so I settled for "site".) Individual buses
that want to support this new attribute can opt in by setting a flag in
bus_type, and then populating the location of the device while enumerating
it.
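
The opt-in check and the strings that the "site" attribute reports can be
sketched in userspace as follows. This is only an illustration under stated
assumptions: struct toy_bus, toy_bus_supports_site() and toy_site_name() are
hypothetical names; the logic follows bus_supports_site() and site_show()
from the diff below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors enum device_site from this patch. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Hypothetical, trimmed-down bus_type holding only the opt-in flag. */
struct toy_bus {
	bool supports_site;
};

/* Same NULL-safe check as bus_supports_site() in this patch. */
static bool toy_bus_supports_site(const struct toy_bus *bus)
{
	return bus && bus->supports_site;
}

/* String that the "site" attribute shows for each value. */
static const char *toy_site_name(enum device_site site)
{
	switch (site) {
	case SITE_INTERNAL:
		return "INTERNAL";
	case SITE_EXTENDED:
		return "EXTENDED";
	case SITE_EXTERNAL:
		return "EXTERNAL";
	case SITE_UNKNOWN:
	default:
		return "UNKNOWN";
	}
}
```

A device on a bus that has not opted in gets no "site" file at all, while a
device on an opted-in bus that never calls dev_set_site() reports UNKNOWN,
since SITE_UNKNOWN is the zero value.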

Signed-off-by: Rajat Jain <[email protected]>
---
v2: (Initial version)

drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
include/linux/device.h | 42 ++++++++++++++++++++++++++++++++++++++
include/linux/device/bus.h | 8 ++++++++
3 files changed, 85 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 67d39a90b45c7..14c815526b7fa 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1778,6 +1778,32 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RW(online);

+static ssize_t site_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ const char *site;
+
+ device_lock(dev);
+ switch (dev->site) {
+ case SITE_INTERNAL:
+ site = "INTERNAL";
+ break;
+ case SITE_EXTENDED:
+ site = "EXTENDED";
+ break;
+ case SITE_EXTERNAL:
+ site = "EXTERNAL";
+ break;
+ case SITE_UNKNOWN:
+ default:
+ site = "UNKNOWN";
+ break;
+ }
+ device_unlock(dev);
+ return sprintf(buf, "%s\n", site);
+}
+static DEVICE_ATTR_RO(site);
+
int device_add_groups(struct device *dev, const struct attribute_group **groups)
{
return sysfs_create_groups(&dev->kobj, groups);
@@ -1949,8 +1975,16 @@ static int device_add_attrs(struct device *dev)
goto err_remove_dev_groups;
}

+ if (bus_supports_site(dev->bus)) {
+ error = device_create_file(dev, &dev_attr_site);
+ if (error)
+ goto err_remove_dev_attr_online;
+ }
+
return 0;

+ err_remove_dev_attr_online:
+ device_remove_file(dev, &dev_attr_online);
err_remove_dev_groups:
device_remove_groups(dev, dev->groups);
err_remove_type_groups:
@@ -1968,6 +2002,7 @@ static void device_remove_attrs(struct device *dev)
struct class *class = dev->class;
const struct device_type *type = dev->type;

+ device_remove_file(dev, &dev_attr_site);
device_remove_file(dev, &dev_attr_online);
device_remove_groups(dev, dev->groups);

diff --git a/include/linux/device.h b/include/linux/device.h
index 15460a5ac024a..a4143735ae712 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -428,6 +428,31 @@ enum dl_dev_state {
DL_DEV_UNBINDING,
};

+/**
+ * enum device_site - Physical location of the device in the system.
+ * The semantics of values depend on subsystem / bus:
+ *
+ * @SITE_UNKNOWN: Location is Unknown (default)
+ *
+ * @SITE_INTERNAL: Device is internal to the system, and cannot be (easily)
+ * removed. E.g. SoC internal devices, onboard soldered
+ * devices, internal M.2 cards (that cannot be removed
+ * without opening the chassis).
+ * @SITE_EXTENDED: Device sits on an extension of the system. E.g. devices
+ * on external PCIe trays, docking stations etc. These
+ * devices may be removable, but are generally housed
+ * internally on an extension board, so they are removed
+ * only when that whole extension board is removed.
+ * @SITE_EXTERNAL: Devices truly external to the system (i.e. plugged on
+ * an external port) that may be removed or added frequently.
+ */
+enum device_site {
+ SITE_UNKNOWN = 0,
+ SITE_INTERNAL,
+ SITE_EXTENDED,
+ SITE_EXTERNAL,
+};
+
/**
* struct dev_links_info - Device data related to device links.
* @suppliers: List of links to supplier devices.
@@ -513,6 +538,7 @@ struct dev_links_info {
* device (i.e. the bus driver that discovered the device).
* @iommu_group: IOMMU group the device belongs to.
* @iommu: Per device generic IOMMU runtime data
+ * @site: Physical location of the device w.r.t. the system
*
* @offline_disabled: If set, the device is permanently online.
* @offline: Set after successful invocation of bus type's .offline().
@@ -613,6 +639,8 @@ struct device {
struct iommu_group *iommu_group;
struct dev_iommu *iommu;

+ enum device_site site; /* Device physical location */
+
bool offline_disabled:1;
bool offline:1;
bool of_node_reused:1;
@@ -806,6 +834,20 @@ static inline bool dev_has_sync_state(struct device *dev)
return false;
}

+static inline int dev_set_site(struct device *dev, enum device_site site)
+{
+ if (site < SITE_UNKNOWN || site > SITE_EXTERNAL)
+ return -EINVAL;
+
+ dev->site = site;
+ return 0;
+}
+
+static inline bool dev_is_external(struct device *dev)
+{
+ return dev->site == SITE_EXTERNAL;
+}
+
/*
* High level routines for use by the bus drivers
*/
diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h
index 1ea5e1d1545bd..e1079772e45af 100644
--- a/include/linux/device/bus.h
+++ b/include/linux/device/bus.h
@@ -69,6 +69,8 @@ struct fwnode_handle;
* @lock_key: Lock class key for use by the lock validator
* @need_parent_lock: When probing or removing a device on this bus, the
* device core should lock the device's parent.
+ * @supports_site: Bus can differentiate between internal/external devices
+ * and thus supports the device "site" attribute.
*
* A bus is a channel between the processor and one or more devices. For the
* purposes of the device model, all devices are connected via a bus, even if
@@ -112,6 +114,7 @@ struct bus_type {
struct lock_class_key lock_key;

bool need_parent_lock;
+ bool supports_site;
};

extern int __must_check bus_register(struct bus_type *bus);
@@ -246,6 +249,11 @@ bus_find_device_by_acpi_dev(struct bus_type *bus, const void *adev)
}
#endif

+static inline bool bus_supports_site(struct bus_type *bus)
+{
+ return bus && bus->supports_site;
+}
+
struct device *subsys_find_device_by_id(struct bus_type *bus, unsigned int id,
struct device *hint);
int bus_for_each_drv(struct bus_type *bus, struct device_driver *start,
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 04:51:18

by Rajat Jain

Subject: [PATCH v2 7/7] PCI: Add parameter to disable attaching external devices

Introduce a PCI parameter that disables the automatic attachment of
external devices to their drivers.

This is needed to allow an admin to control which drivers they want to
allow on external ports. For more context, see threads at:
https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
https://lore.kernel.org/linux-pci/CACK8Z6H-DZQYBMqtU5_H5TTwwn35Q7Yysm9a7Wj0twfQP8QBzA@mail.gmail.com/

drivers_autoprobe can only be disabled after userspace comes up. So
any external devices that were plugged in before boot may still bind
to drivers before userspace gets a chance to clear drivers_autoprobe.
Another problem is that even with drivers_autoprobe=0, the hot-added
PCI devices are bound to drivers because PCI explicitly calls
device_attach() asking driver core to find and attach a driver. This
patch helps with both of these problems.
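
The behavior can be summarized with a small userspace sketch. It is not the
kernel code: toy_pci_setup_token() and toy_should_attach() are hypothetical
names that model one token handled by pci_setup() and the new branch in
pci_bus_add_device(), respectively.

```c
#include <stdbool.h>
#include <string.h>

/* Models the new pci_dont_attach_external_devs global. */
static bool dont_attach_external_devs;

/*
 * Models how pci_setup() handles one option token from "pci=...".
 * Returns true if the token was recognized by this sketch.
 */
static bool toy_pci_setup_token(const char *tok)
{
	if (!strcmp(tok, "dont_attach_external_devs")) {
		dont_attach_external_devs = true;
		return true;
	}
	return false;
}

/*
 * Models the decision in pci_bus_add_device(): skip device_attach()
 * only when the parameter is set AND the device is external.
 */
static bool toy_should_attach(bool dev_is_external)
{
	return !(dont_attach_external_devs && dev_is_external);
}
```

With the patch applied, the parameter would be passed on the kernel command
line as pci=dont_attach_external_devs, and external devices would then have
to be bound from userspace, e.g. by writing the device name to
/sys/bus/pci/drivers/<driver>/bind as noted in the code comment below.
Internal devices are attached as before.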

Signed-off-by: Rajat Jain <[email protected]>
---
v2: Use the newly introduced dev_is_external() from device core
commit log elaborated

drivers/pci/bus.c | 11 ++++++++---
drivers/pci/pci.c | 9 +++++++++
drivers/pci/pci.h | 1 +
3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 3cef835b375fd..c11725bccffb0 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -321,9 +321,14 @@ void pci_bus_add_device(struct pci_dev *dev)
pci_bridge_d3_update(dev);

dev->match_driver = true;
- retval = device_attach(&dev->dev);
- if (retval < 0 && retval != -EPROBE_DEFER)
- pci_warn(dev, "device attach failed (%d)\n", retval);
+
+ if (pci_dont_attach_external_devs && dev_is_external(&dev->dev)) {
+ pci_info(dev, "not attaching external device\n");
+ } else {
+ retval = device_attach(&dev->dev);
+ if (retval < 0 && retval != -EPROBE_DEFER)
+ pci_warn(dev, "device attach failed (%d)\n", retval);
+ }

pci_dev_assign_added(dev, true);
}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 35f25ac39167b..3ebcfa8b33178 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -128,6 +128,13 @@ static bool pcie_ats_disabled;
/* If set, the PCI config space of each device is printed during boot. */
bool pci_early_dump;

+/*
+ * If set, the devices behind external-facing bridges (as marked by firmware)
+ * shall not be attached automatically. Userspace will need to attach them
+ * manually: echo <pci device> > /sys/bus/pci/drivers/<driver>/bind
+ */
+bool pci_dont_attach_external_devs;
+
bool pci_ats_disabled(void)
{
return pcie_ats_disabled;
@@ -6539,6 +6546,8 @@ static int __init pci_setup(char *str)
pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
} else if (!strncmp(str, "disable_acs_redir=", 18)) {
disable_acs_redir_param = str + 18;
+ } else if (!strcmp(str, "dont_attach_external_devs")) {
+ pci_dont_attach_external_devs = true;
} else {
pr_err("PCI: Unknown option `%s'\n", str);
}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 12fb79fbe29d3..875fecb9b2612 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -13,6 +13,7 @@

extern const unsigned char pcie_link_speed[];
extern bool pci_early_dump;
+extern bool pci_dont_attach_external_devs;

bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
bool pcie_cap_has_rtctl(const struct pci_dev *dev);
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 04:51:23

by Rajat Jain

Subject: [PATCH v2 4/7] PCI: Add device even if driver attach failed

device_attach() returning failure indicates a driver error while trying to
probe the device. In such a scenario, the PCI device should still be added
to the system and be visible to the user.

This patch partially reverts:
commit ab1a187bba5c ("PCI: Check device_attach() return value always")

Signed-off-by: Rajat Jain <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
---
v2: Cosmetic change in commit log.
Add Greg's "reviewed-by"

drivers/pci/bus.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 8e40b3e6da77d..3cef835b375fd 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)

dev->match_driver = true;
retval = device_attach(&dev->dev);
- if (retval < 0 && retval != -EPROBE_DEFER) {
+ if (retval < 0 && retval != -EPROBE_DEFER)
pci_warn(dev, "device attach failed (%d)\n", retval);
- pci_proc_detach_device(dev);
- pci_remove_sysfs_dev_files(dev);
- return;
- }

pci_dev_assign_added(dev, true);
}
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 04:52:05

by Rajat Jain

Subject: [PATCH v2 3/7] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices

When enabling ACS, enable translation blocking for external facing ports
and untrusted devices.
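
As a rough illustration of the bit logic, here is a userspace sketch.
toy_acs_ctrl() is a hypothetical helper, not a kernel function; the bit
values are the ACS definitions from include/uapi/linux/pci_regs.h, and the
set of unconditionally enabled bits follows the list in the quirk comment
added by this patch (SV, RR, CR, UF).

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS capability/control bits, as in include/uapi/linux/pci_regs.h. */
#define PCI_ACS_SV	0x0001	/* Source Validation */
#define PCI_ACS_TB	0x0002	/* Translation Blocking */
#define PCI_ACS_RR	0x0004	/* P2P Request Redirect */
#define PCI_ACS_CR	0x0008	/* P2P Completion Redirect */
#define PCI_ACS_UF	0x0010	/* Upstream Forwarding */

/*
 * Computes the new ACS control value. Each bit is enabled only if the
 * capability register advertises it; TB is additionally gated on the
 * device being external-facing or untrusted.
 */
static uint16_t toy_acs_ctrl(uint16_t cap, uint16_t ctrl,
			     bool external_facing, bool untrusted)
{
	ctrl |= (cap & PCI_ACS_SV);
	ctrl |= (cap & PCI_ACS_RR);
	ctrl |= (cap & PCI_ACS_CR);
	ctrl |= (cap & PCI_ACS_UF);

	if (external_facing || untrusted)
		ctrl |= (cap & PCI_ACS_TB);

	return ctrl;
}
```

Masking with cap means the request is silently dropped on hardware that
does not implement Translation Blocking, so the write never sets a bit the
device does not support.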

Signed-off-by: Rajat Jain <[email protected]>
---
v2: Commit log change

drivers/pci/pci.c | 4 ++++
drivers/pci/quirks.c | 11 +++++++++++
2 files changed, 15 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d2ff987585855..79853b52658a2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
/* Upstream Forwarding */
ctrl |= (cap & PCI_ACS_UF);

+ if (dev->external_facing || dev->untrusted)
+ /* Translation Blocking */
+ ctrl |= (cap & PCI_ACS_TB);
+
pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
}

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b341628e47527..6294adeac4049 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
}
}

+/*
+ * Currently this quirk does the equivalent of
+ * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
+ *
+ * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
+ * if dev->external_facing || dev->untrusted
+ */
static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
{
if (!pci_quirk_intel_pch_acs_match(dev))
@@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
ctrl |= (cap & PCI_ACS_CR);
ctrl |= (cap & PCI_ACS_UF);

+ if (dev->external_facing || dev->untrusted)
+ /* Translation Blocking */
+ ctrl |= (cap & PCI_ACS_TB);
+
pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);

pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 04:53:33

by Rajat Jain

Subject: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

The "ExternalFacing" devices (root ports) are still internal devices that
sit on the internal system fabric and are thus trusted. Currently they are
marked untrusted.

This patch uses the platform flag to identify the external-facing devices
and then uses it to mark any downstream devices as "untrusted". The
external-facing devices themselves are left as "trusted". This was
discussed here: https://lkml.org/lkml/2020/6/10/1049

Signed-off-by: Rajat Jain <[email protected]>
---
v2: cosmetic changes in commit log

drivers/iommu/intel/iommu.c | 2 +-
drivers/pci/of.c | 2 +-
drivers/pci/pci-acpi.c | 13 +++++++------
drivers/pci/probe.c | 2 +-
include/linux/pci.h | 8 ++++++++
5 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e982..1ccb224f82496 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4743,7 +4743,7 @@ static inline bool has_untrusted_dev(void)
struct pci_dev *pdev = NULL;

for_each_pci_dev(pdev)
- if (pdev->untrusted)
+ if (pdev->untrusted || pdev->external_facing)
return true;

return false;
diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 27839cd2459f6..22727fc9558df 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -42,7 +42,7 @@ void pci_set_bus_of_node(struct pci_bus *bus)
} else {
node = of_node_get(bus->self->dev.of_node);
if (node && of_property_read_bool(node, "external-facing"))
- bus->self->untrusted = true;
+ bus->self->external_facing = true;
}

bus->dev.of_node = node;
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7224b1e5f2a83..492c07805caf8 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1213,22 +1213,23 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
ACPI_FREE(obj);
}

-static void pci_acpi_set_untrusted(struct pci_dev *dev)
+static void pci_acpi_set_external_facing(struct pci_dev *dev)
{
u8 val;

- if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
+ if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
+ pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
return;
if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val))
return;

/*
- * These root ports expose PCIe (including DMA) outside of the
- * system so make sure we treat them and everything behind as
+ * These root/down ports expose PCIe (including DMA) outside of the
+ * system so make sure we treat everything behind them as
* untrusted.
*/
if (val)
- dev->untrusted = 1;
+ dev->external_facing = 1;
}

static void pci_acpi_setup(struct device *dev)
@@ -1240,7 +1241,7 @@ static void pci_acpi_setup(struct device *dev)
return;

pci_acpi_optimize_delay(pci_dev, adev->handle);
- pci_acpi_set_untrusted(pci_dev);
+ pci_acpi_set_external_facing(pci_dev);
pci_acpi_add_edr_notifier(pci_dev);

pci_acpi_add_pm_notifier(adev, pci_dev);
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6d87066a5ecc5..8c40c00413e74 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1552,7 +1552,7 @@ static void set_pcie_untrusted(struct pci_dev *dev)
* untrusted as well.
*/
parent = pci_upstream_bridge(dev);
- if (parent && parent->untrusted)
+ if (parent && (parent->untrusted || parent->external_facing))
dev->untrusted = true;
}

diff --git a/include/linux/pci.h b/include/linux/pci.h
index a26be5332bba6..fe1bc603fda40 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -432,6 +432,14 @@ struct pci_dev {
* mappings to make sure they cannot access arbitrary memory.
*/
unsigned int untrusted:1;
+ /*
+ * Devices are marked as external-facing using info from platform
+ * (ACPI / devicetree). An external-facing device is still an internal
+ * trusted device, but it faces external untrusted devices. Thus any
+ * devices enumerated downstream an external-facing device is marked
+ * as untrusted.
+ */
+ unsigned int external_facing:1;
unsigned int broken_intx_masking:1; /* INTx masking can't be used */
unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */
unsigned int irq_managed:1;
--
2.27.0.212.ge8ba1cc988-goog

2020-06-30 07:40:14

by Baolu Lu

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On 2020/6/30 12:49, Rajat Jain wrote:
> The "ExternalFacing" devices (root ports) are still internal devices that
> sit on the internal system fabric and thus trusted. Currently they were
> being marked untrusted.
>
> This patch uses the platform flag to identify the external facing devices
> and then use it to mark any downstream devices as "untrusted". The
> external-facing devices themselves are left as "trusted". This was
> discussed here: https://lkml.org/lkml/2020/6/10/1049
>
> Signed-off-by: Rajat Jain <[email protected]>

For changes in Intel VT-d driver,

Reviewed-by: Lu Baolu <[email protected]>

Best regards,
baolu

> ---
> v2: cosmetic changes in commit log
>
> drivers/iommu/intel/iommu.c | 2 +-
> drivers/pci/of.c | 2 +-
> drivers/pci/pci-acpi.c | 13 +++++++------
> drivers/pci/probe.c | 2 +-
> include/linux/pci.h | 8 ++++++++
> 5 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index d759e7234e982..1ccb224f82496 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4743,7 +4743,7 @@ static inline bool has_untrusted_dev(void)
> struct pci_dev *pdev = NULL;
>
> for_each_pci_dev(pdev)
> - if (pdev->untrusted)
> + if (pdev->untrusted || pdev->external_facing)
> return true;
>
> return false;
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 27839cd2459f6..22727fc9558df 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -42,7 +42,7 @@ void pci_set_bus_of_node(struct pci_bus *bus)
> } else {
> node = of_node_get(bus->self->dev.of_node);
> if (node && of_property_read_bool(node, "external-facing"))
> - bus->self->untrusted = true;
> + bus->self->external_facing = true;
> }
>
> bus->dev.of_node = node;
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 7224b1e5f2a83..492c07805caf8 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1213,22 +1213,23 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
> ACPI_FREE(obj);
> }
>
> -static void pci_acpi_set_untrusted(struct pci_dev *dev)
> +static void pci_acpi_set_external_facing(struct pci_dev *dev)
> {
> u8 val;
>
> - if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
> return;
> if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val))
> return;
>
> /*
> - * These root ports expose PCIe (including DMA) outside of the
> - * system so make sure we treat them and everything behind as
> + * These root/down ports expose PCIe (including DMA) outside of the
> + * system so make sure we treat everything behind them as
> * untrusted.
> */
> if (val)
> - dev->untrusted = 1;
> + dev->external_facing = 1;
> }
>
> static void pci_acpi_setup(struct device *dev)
> @@ -1240,7 +1241,7 @@ static void pci_acpi_setup(struct device *dev)
> return;
>
> pci_acpi_optimize_delay(pci_dev, adev->handle);
> - pci_acpi_set_untrusted(pci_dev);
> + pci_acpi_set_external_facing(pci_dev);
> pci_acpi_add_edr_notifier(pci_dev);
>
> pci_acpi_add_pm_notifier(adev, pci_dev);
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 6d87066a5ecc5..8c40c00413e74 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1552,7 +1552,7 @@ static void set_pcie_untrusted(struct pci_dev *dev)
> * untrusted as well.
> */
> parent = pci_upstream_bridge(dev);
> - if (parent && parent->untrusted)
> + if (parent && (parent->untrusted || parent->external_facing))
> dev->untrusted = true;
> }
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index a26be5332bba6..fe1bc603fda40 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -432,6 +432,14 @@ struct pci_dev {
> * mappings to make sure they cannot access arbitrary memory.
> */
> unsigned int untrusted:1;
> + /*
> + * Devices are marked as external-facing using info from platform
> + * (ACPI / devicetree). An external-facing device is still an internal
> + * trusted device, but it faces external untrusted devices. Thus any
> + * devices enumerated downstream an external-facing device is marked
> + * as untrusted.
> + */
> + unsigned int external_facing:1;
> unsigned int broken_intx_masking:1; /* INTx masking can't be used */
> unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */
> unsigned int irq_managed:1;
>

2020-06-30 07:43:23

by Baolu Lu

Subject: Re: [PATCH v2 6/7] PCI: Move pci_dev->untrusted logic to use device location instead

On 2020/6/30 12:49, Rajat Jain wrote:
> The firmware was providing "ExternalFacing" attribute on PCI root ports,
> to allow the kernel to mark devices behind it as external. Note that the
> firmware provides an immutable, read-only property, i.e. the location of
> the device.
>
> The use of (external) device location as hint for (dis)trust, is a
> decision that IOMMU drivers have taken, so we should call it out
> explicitly.
>
> This patch removes the pci_dev->untrusted and changes the users of it to
> use device core provided device location instead. This location is
> populated by PCI using the same "ExternalFacing" firmware info. Any
> device not behind an "ExternalFacing" bridge is marked internal and
> the ones behind such bridges are marked external.
>
> Signed-off-by: Rajat Jain <[email protected]>

For changes in Intel VT-d driver,

Reviewed-by: Lu Baolu <[email protected]>

Best regards,
baolu

> ---
> v2: (Initial version)
>
> drivers/iommu/intel/iommu.c | 31 +++++++++++++++++++++----------
> drivers/pci/ats.c | 2 +-
> drivers/pci/pci-driver.c | 1 +
> drivers/pci/pci.c | 2 +-
> drivers/pci/probe.c | 18 ++++++++++++------
> drivers/pci/quirks.c | 2 +-
> include/linux/pci.h | 10 +---------
> 7 files changed, 38 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 1ccb224f82496..ca66a196f5e97 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -168,6 +168,22 @@ static inline unsigned long virt_to_dma_pfn(void *p)
> return page_to_dma_pfn(virt_to_page(p));
> }
>
> +static inline bool untrusted_dev(struct device *dev)
> +{
> + /*
> + * Treat all external PCI devices as untrusted devices. These are the
> + * devices behind external-facing bridges, as marked by
> + * the firmware. The untrusted devices are the ones that can potentially
> + * execute DMA attacks and similar. They are typically connected through
> + * external thunderbolt ports. When an IOMMU is enabled they should be
> + * getting full mappings to ensure they cannot access arbitrary memory.
> + */
> + if (dev_is_pci(dev) && dev_is_external(dev))
> + return true;
> +
> + return false;
> +}
> +
> /* global iommu list, set NULL for ignored DMAR units */
> static struct intel_iommu **g_iommus;
>
> @@ -383,8 +399,7 @@ struct device_domain_info *get_domain_info(struct device *dev)
> DEFINE_SPINLOCK(device_domain_lock);
> static LIST_HEAD(device_domain_list);
>
> -#define device_needs_bounce(d) (!intel_no_bounce && dev_is_pci(d) && \
> - to_pci_dev(d)->untrusted)
> +#define device_needs_bounce(d) (!intel_no_bounce && untrusted_dev(d))
>
> /*
> * Iterate over elements in device_domain_list and call the specified
> @@ -2830,7 +2845,7 @@ static int device_def_domain_type(struct device *dev)
> * Prevent any device marked as untrusted from getting
> * placed into the statically identity mapping domain.
> */
> - if (pdev->untrusted)
> + if (untrusted_dev(dev))
> return IOMMU_DOMAIN_DMA;
>
> if ((iommu_identity_mapping & IDENTMAP_AZALIA) && IS_AZALIA(pdev))
> @@ -3464,7 +3479,6 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
> unsigned long iova_pfn;
> struct intel_iommu *iommu;
> struct page *freelist;
> - struct pci_dev *pdev = NULL;
>
> domain = find_domain(dev);
> BUG_ON(!domain);
> @@ -3477,11 +3491,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
> start_pfn = mm_to_dma_pfn(iova_pfn);
> last_pfn = start_pfn + nrpages - 1;
>
> - if (dev_is_pci(dev))
> - pdev = to_pci_dev(dev);
> -
> freelist = domain_unmap(domain, start_pfn, last_pfn);
> - if (intel_iommu_strict || (pdev && pdev->untrusted) ||
> + if (intel_iommu_strict || untrusted_dev(dev) ||
> !has_iova_flush_queue(&domain->iovad)) {
> iommu_flush_iotlb_psi(iommu, domain, start_pfn,
> nrpages, !freelist, 0);
> @@ -4743,7 +4754,7 @@ static inline bool has_untrusted_dev(void)
> struct pci_dev *pdev = NULL;
>
> for_each_pci_dev(pdev)
> - if (pdev->untrusted || pdev->external_facing)
> + if (pdev->external_facing || untrusted_dev(&pdev->dev))
> return true;
>
> return false;
> @@ -6036,7 +6047,7 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
> */
> static bool risky_device(struct pci_dev *pdev)
> {
> - if (pdev->untrusted) {
> + if (untrusted_dev(&pdev->dev)) {
> pci_info(pdev,
> "Skipping IOMMU quirk for dev [%04X:%04X] on untrusted PCI link\n",
> pdev->vendor, pdev->device);
> diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> index b761c1f72f672..ebd370f4d5b06 100644
> --- a/drivers/pci/ats.c
> +++ b/drivers/pci/ats.c
> @@ -42,7 +42,7 @@ bool pci_ats_supported(struct pci_dev *dev)
> if (!dev->ats_cap)
> return false;
>
> - return (dev->untrusted == 0);
> + return (!dev_is_external(&dev->dev));
> }
> EXPORT_SYMBOL_GPL(pci_ats_supported);
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index da6510af12214..9608053a8a62c 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1630,6 +1630,7 @@ struct bus_type pci_bus_type = {
> .pm = PCI_PM_OPS_PTR,
> .num_vf = pci_bus_num_vf,
> .dma_configure = pci_dma_configure,
> + .supports_site = true,
> };
> EXPORT_SYMBOL(pci_bus_type);
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 79853b52658a2..35f25ac39167b 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3330,7 +3330,7 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> /* Upstream Forwarding */
> ctrl |= (cap & PCI_ACS_UF);
>
> - if (dev->external_facing || dev->untrusted)
> + if (dev->external_facing || dev_is_external(&dev->dev))
> /* Translation Blocking */
> ctrl |= (cap & PCI_ACS_TB);
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 8c40c00413e74..1609329cc5b4e 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1543,17 +1543,23 @@ static void set_pcie_thunderbolt(struct pci_dev *dev)
> }
> }
>
> -static void set_pcie_untrusted(struct pci_dev *dev)
> +static void set_pcie_dev_site(struct pci_dev *dev)
> {
> struct pci_dev *parent;
>
> /*
> - * If the upstream bridge is untrusted we treat this device
> - * untrusted as well.
> + * All devices are considered internal by default, unless behind an
> + * external-facing bridge, as marked by the firmware.
> + */
> + dev_set_site(&dev->dev, SITE_INTERNAL);
> +
> + /*
> + * If the upstream bridge is external or external-facing, this device
> + * is also external.
> */
> parent = pci_upstream_bridge(dev);
> - if (parent && (parent->untrusted || parent->external_facing))
> - dev->untrusted = true;
> + if (parent && (parent->external_facing || dev_is_external(&parent->dev)))
> + dev_set_site(&dev->dev, SITE_EXTERNAL);
> }
>
> /**
> @@ -1814,7 +1820,7 @@ int pci_setup_device(struct pci_dev *dev)
> /* Need to have dev->cfg_size ready */
> set_pcie_thunderbolt(dev);
>
> - set_pcie_untrusted(dev);
> + set_pcie_dev_site(dev);
>
> /* "Unknown power state" */
> dev->current_state = PCI_UNKNOWN;
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6294adeac4049..65d0b8745c915 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4980,7 +4980,7 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> ctrl |= (cap & PCI_ACS_CR);
> ctrl |= (cap & PCI_ACS_UF);
>
> - if (dev->external_facing || dev->untrusted)
> + if (dev->external_facing || dev_is_external(&dev->dev))
> /* Translation Blocking */
> ctrl |= (cap & PCI_ACS_TB);
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fe1bc603fda40..8bb5065e5aed2 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -424,20 +424,12 @@ struct pci_dev {
> unsigned int is_hotplug_bridge:1;
> unsigned int shpc_managed:1; /* SHPC owned by shpchp */
> unsigned int is_thunderbolt:1; /* Thunderbolt controller */
> - /*
> - * Devices marked being untrusted are the ones that can potentially
> - * execute DMA attacks and similar. They are typically connected
> - * through external ports such as Thunderbolt but not limited to
> - * that. When an IOMMU is enabled they should be getting full
> - * mappings to make sure they cannot access arbitrary memory.
> - */
> - unsigned int untrusted:1;
> /*
> * Devices are marked as external-facing using info from platform
> * (ACPI / devicetree). An external-facing device is still an internal
> * trusted device, but it faces external untrusted devices. Thus any
> * devices enumerated downstream an external-facing device is marked
> - * as untrusted.
> + * as external device.
> */
> unsigned int external_facing:1;
> unsigned int broken_intx_masking:1; /* INTx masking can't be used */
>
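
The propagation rule in the probe.c hunk above can be tried outside the kernel. The following is a minimal user-space sketch, not the patch itself: struct mock_pci_dev and mock_set_site() are hypothetical stand-ins for struct pci_dev and set_pcie_dev_site(), and it assumes devices are set up top-down (parent before child), as they would be during enumeration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators as enum device_site in patch 5/7. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Hypothetical stand-in for struct pci_dev; only the fields the rule needs. */
struct mock_pci_dev {
	struct mock_pci_dev *parent;	/* upstream bridge, NULL at the root */
	bool external_facing;		/* firmware-provided flag on a root port */
	enum device_site site;
};

/*
 * Mirrors the quoted logic: default to internal, then mark the device
 * external if its upstream bridge is external-facing or itself external.
 */
void mock_set_site(struct mock_pci_dev *dev)
{
	struct mock_pci_dev *parent = dev->parent;

	dev->site = SITE_INTERNAL;
	if (parent && (parent->external_facing || parent->site == SITE_EXTERNAL))
		dev->site = SITE_EXTERNAL;
}
```

With a chain root port -> switch -> endpoint where only the root port is external-facing, the root port stays internal while both devices below it end up external.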

2020-06-30 08:02:38

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
> The "ExternalFacing" devices (root ports) are still internal devices that
> sit on the internal system fabric and thus trusted. Currently they were
> being marked untrusted.
>
> This patch uses the platform flag to identify the external facing devices
> and then use it to mark any downstream devices as "untrusted". The
> external-facing devices themselves are left as "trusted". This was
> discussed here: https://lkml.org/lkml/2020/6/10/1049

{sigh}

First off, please use lore.kernel.org links; we don't control lkml.org
and it has often been down.

Also, you need to put all of the information in the changelog, referring
to another place isn't always the best thing, considering you will be
looking this up in 20+ years to try to figure out why people came up
with such a crazy design.

But, the main point is, no, we did not decide on this. "trust" is a
policy decision to be made by userspace, it is independent of "location",
while you are tying it directly here, which is what I explicitly said
NOT to do.

So again, no, I will NAK this patch as-is, sorry, you are mixing things
together in a way that should not be done at this point in time.

greg k-h

2020-06-30 08:05:46

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> Add a new (optional) field to denote the physical location of a device
> in the system, and expose it in sysfs. This was discussed here:
> https://lore.kernel.org/linux-acpi/[email protected]/
>
> (The primary choice for attribute name i.e. "location" is already
> exposed as an ABI elsewhere, so settled for "site").

Where is "location" exported? I see one USB port sysfs attribute, is
that what you are worried about here?

> Individual buses
> that want to support this new attribute can opt-in by setting a flag in
> bus_type, and then populating the location of device while enumerating
> it.
>
> Signed-off-by: Rajat Jain <[email protected]>
> ---
> v2: (Initial version)
>
> drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
> include/linux/device.h | 42 ++++++++++++++++++++++++++++++++++++++
> include/linux/device/bus.h | 8 ++++++++
> 3 files changed, 85 insertions(+)


No Documentation/ABI/ update for this new attribute? Why not?

>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 67d39a90b45c7..14c815526b7fa 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1778,6 +1778,32 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
> }
> static DEVICE_ATTR_RW(online);
>
> +static ssize_t site_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + const char *site;
> +
> + device_lock(dev);
> + switch (dev->site) {
> + case SITE_INTERNAL:
> + site = "INTERNAL";
> + break;
> + case SITE_EXTENDED:
> + site = "EXTENDED";
> + break;
> + case SITE_EXTERNAL:
> + site = "EXTERNAL";
> + break;
> + case SITE_UNKNOWN:
> + default:
> + site = "UNKNOWN";
> + break;
> + }
> + device_unlock(dev);

Why are you locking/unlocking a device here?

You have a reference count on the structure, are you worried about
something else changing here on it? If so, what? You aren't locking it
when the state is set (which is fine, really, you shouldn't need to.)
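
As an illustration of the point, here is a user-space sketch of the same formatting done with no lock at all: a single enum field is read once and mapped to a string. site_name() and mock_site_show() are hypothetical names, and snprintf() stands in for the kernel's sprintf() into the sysfs buffer; this is an assumption-laden sketch, not a proposed replacement hunk.

```c
#include <stdio.h>

/* Same enumerators as enum device_site in the quoted patch. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Map a site value to its sysfs string; anything unexpected reads as UNKNOWN. */
const char *site_name(enum device_site site)
{
	switch (site) {
	case SITE_INTERNAL:
		return "INTERNAL";
	case SITE_EXTENDED:
		return "EXTENDED";
	case SITE_EXTERNAL:
		return "EXTERNAL";
	case SITE_UNKNOWN:
	default:
		return "UNKNOWN";
	}
}

/* Stand-in for site_show(): formats into buf and returns the length written. */
int mock_site_show(enum device_site site, char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", site_name(site));
}
```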


> + return sprintf(buf, "%s\n", site);
> +}
> +static DEVICE_ATTR_RO(site);
> +
> int device_add_groups(struct device *dev, const struct attribute_group **groups)
> {
> return sysfs_create_groups(&dev->kobj, groups);
> @@ -1949,8 +1975,16 @@ static int device_add_attrs(struct device *dev)
> goto err_remove_dev_groups;
> }
>
> + if (bus_supports_site(dev->bus)) {
> + error = device_create_file(dev, &dev_attr_site);
> + if (error)
> + goto err_remove_dev_attr_online;
> + }
> +
> return 0;
>
> + err_remove_dev_attr_online:
> + device_remove_file(dev, &dev_attr_online);
> err_remove_dev_groups:
> device_remove_groups(dev, dev->groups);
> err_remove_type_groups:
> @@ -1968,6 +2002,7 @@ static void device_remove_attrs(struct device *dev)
> struct class *class = dev->class;
> const struct device_type *type = dev->type;
>
> + device_remove_file(dev, &dev_attr_site);
> device_remove_file(dev, &dev_attr_online);
> device_remove_groups(dev, dev->groups);
>
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 15460a5ac024a..a4143735ae712 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -428,6 +428,31 @@ enum dl_dev_state {
> DL_DEV_UNBINDING,
> };
>
> +/**
> + * enum device_site - Physical location of the device in the system.
> + * The semantics of values depend on subsystem / bus:
> + *
> + * @SITE_UNKNOWN: Location is Unknown (default)
> + *
> + * @SITE_INTERNAL: Device is internal to the system, and cannot be (easily)
> + * removed. E.g. SoC internal devices, onboard soldered
> + * devices, internal M.2 cards (that cannot be removed
> + * without opening the chassis).
> + * @SITE_EXTENDED: Device sits on an extension of the system. E.g. devices
> + * on external PCIe trays, docking stations etc. These
> + * devices may be removable, but are generally housed
> + * internally on an extension board, so they are removed
> + * only when that whole extension board is removed.
> + * @SITE_EXTERNAL: Devices truly external to the system (i.e. plugged on
> + * an external port) that may be removed or added frequently.
> + */
> +enum device_site {
> + SITE_UNKNOWN = 0,
> + SITE_INTERNAL,
> + SITE_EXTENDED,
> + SITE_EXTERNAL,
> +};
> +
> /**
> * struct dev_links_info - Device data related to device links.
> * @suppliers: List of links to supplier devices.
> @@ -513,6 +538,7 @@ struct dev_links_info {
> * device (i.e. the bus driver that discovered the device).
> * @iommu_group: IOMMU group the device belongs to.
> * @iommu: Per device generic IOMMU runtime data
> + * @site: Physical location of the device w.r.t. the system
> *
> * @offline_disabled: If set, the device is permanently online.
> * @offline: Set after successful invocation of bus type's .offline().
> @@ -613,6 +639,8 @@ struct device {
> struct iommu_group *iommu_group;
> struct dev_iommu *iommu;
>
> + enum device_site site; /* Device physical location */
> +
> bool offline_disabled:1;
> bool offline:1;
> bool of_node_reused:1;
> @@ -806,6 +834,20 @@ static inline bool dev_has_sync_state(struct device *dev)
> return false;
> }
>
> +static inline int dev_set_site(struct device *dev, enum device_site site)
> +{
> + if (site < SITE_UNKNOWN || site > SITE_EXTERNAL)
> + return -EINVAL;

It's an enum, why check the range?
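
For comparison, a setter without the range check could look like the user-space sketch below. struct mock_device and mock_dev_set_site() are hypothetical stand-ins; the sketch assumes callers only ever pass named enumerators, which is what the PCI caller in patch 6/7 does (and that caller ignores the return value of dev_set_site() anyway).

```c
/* Same enumerators as enum device_site in the quoted patch. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Hypothetical stand-in for struct device; only the new field. */
struct mock_device {
	enum device_site site;
};

/* No range check and no return value: callers pass a named enumerator. */
void mock_dev_set_site(struct mock_device *dev, enum device_site site)
{
	dev->site = site;
}
```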

thanks,

greg k-h

2020-06-30 08:06:00

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 4/7] PCI: Add device even if driver attach failed

On Mon, Jun 29, 2020 at 09:49:40PM -0700, Rajat Jain wrote:
> device_attach() returning failure indicates a driver error while trying to
> probe the device. In such a scenario, the PCI device should still be added
> in the system and be visible to the user.
>
> This patch partially reverts:
> commit ab1a187bba5c ("PCI: Check device_attach() return value always")
>
> Signed-off-by: Rajat Jain <[email protected]>
> Reviewed-by: Greg Kroah-Hartman <[email protected]>
> ---
> v2: Cosmetic change in commit log.
> Add Greg's "reviewed-by"
>
> drivers/pci/bus.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 8e40b3e6da77d..3cef835b375fd 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)
>
> dev->match_driver = true;
> retval = device_attach(&dev->dev);
> - if (retval < 0 && retval != -EPROBE_DEFER) {
> + if (retval < 0 && retval != -EPROBE_DEFER)
> pci_warn(dev, "device attach failed (%d)\n", retval);
> - pci_proc_detach_device(dev);
> - pci_remove_sysfs_dev_files(dev);
> - return;
> - }
>
> pci_dev_assign_added(dev, true);
> }

This should go first in the series, and cc: stable and get merged now.
No need to tie it to this series at all.

Or just an independent patch, it doesn't have much to do with this
series, it's a bugfix.

thanks,

greg k-h

2020-06-30 10:50:47

by Heikki Krogerus

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> Add a new (optional) field to denote the physical location of a device
> in the system, and expose it in sysfs. This was discussed here:
> https://lore.kernel.org/linux-acpi/[email protected]/
>
> (The primary choice for attribute name i.e. "location" is already
> exposed as an ABI elsewhere, so settled for "site"). Individual buses
> that want to support this new attribute can opt-in by setting a flag in
> bus_type, and then populating the location of device while enumerating
> it.

So why not just call it "physical_location"?


thanks,

--
heikki

2020-06-30 12:53:58

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > Add a new (optional) field to denote the physical location of a device
> > in the system, and expose it in sysfs. This was discussed here:
> > https://lore.kernel.org/linux-acpi/[email protected]/
> >
> > (The primary choice for attribute name i.e. "location" is already
> > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > that want to support this new attribute can opt-in by setting a flag in
> > bus_type, and then populating the location of device while enumerating
> > it.
>
> So why not just call it "physical_location"?

That's better, and will allow us to put "3rd blue plug from the left,
4th row down" in there someday :)

All of this is "relative" to the CPU, right? But what CPU? Again, how
are the systems with drawers of PCI and CPUs and memory that can be
added/removed at any point in time being handled here? What is
"internal" and "external" for them?

What exactly is the physical boundary here that is attempting to be
described?

thanks,

greg "not all the world is your laptop" k-h

2020-06-30 13:04:40

by Rafael J. Wysocki

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Tue, Jun 30, 2020 at 2:52 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> > On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > > Add a new (optional) field to denote the physical location of a device
> > > in the system, and expose it in sysfs. This was discussed here:
> > > https://lore.kernel.org/linux-acpi/[email protected]/
> > >
> > > (The primary choice for attribute name i.e. "location" is already
> > > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > > that want to support this new attribute can opt-in by setting a flag in
> > > bus_type, and then populating the location of device while enumerating
> > > it.
> >
> > So why not just call it "physical_location"?
>
> That's better, and will allow us to put "3rd blue plug from the left,
> 4th row down" in there someday :)
>
> All of this is "relative" to the CPU, right? But what CPU? Again, how
> are the systems with drawers of PCI and CPUs and memory that can be
> added/removed at any point in time being handled here? What is
> "internal" and "external" for them?
>
> What exactly is the physical boundary here that is attempting to be
> described?

Also, where is the "physical location" information going to come from?

If that is the platform firmware (which I suspect is the anticipated
case), there may be problems with reliability related to that.

2020-06-30 15:39:32

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Tue, Jun 30, 2020 at 03:00:34PM +0200, Rafael J. Wysocki wrote:
> On Tue, Jun 30, 2020 at 2:52 PM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> > > On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > > > Add a new (optional) field to denote the physical location of a device
> > > > in the system, and expose it in sysfs. This was discussed here:
> > > > https://lore.kernel.org/linux-acpi/[email protected]/
> > > >
> > > > (The primary choice for attribute name i.e. "location" is already
> > > > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > > > that want to support this new attribute can opt-in by setting a flag in
> > > > bus_type, and then populating the location of device while enumerating
> > > > it.
> > >
> > > So why not just call it "physical_location"?
> >
> > That's better, and will allow us to put "3rd blue plug from the left,
> > 4th row down" in there someday :)
> >
> > All of this is "relative" to the CPU, right? But what CPU? Again, how
> > are the systems with drawers of PCI and CPUs and memory that can be
> > added/removed at any point in time being handled here? What is
> > "internal" and "external" for them?
> >
> > What exactly is the physical boundary here that is attempting to be
> > described?
>
> Also, where is the "physical location" information going to come from?

Who knows? :)

Some BIOS seem to provide this, but do you trust that?

> If that is the platform firmware (which I suspect is the anticipated
> case), there may be problems with reliability related to that.

s/may/will/

which means making the kernel enact a policy like this patch series
tries to add, will result in a lot of broken systems, which is why I
keep saying that it needs to be done in userspace.

It's as if some of us haven't been down this road before and just keep
being ignored...

{sigh}

greg k-h

2020-06-30 16:12:04

by Rafael J. Wysocki

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Tue, Jun 30, 2020 at 5:38 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Tue, Jun 30, 2020 at 03:00:34PM +0200, Rafael J. Wysocki wrote:
> > On Tue, Jun 30, 2020 at 2:52 PM Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> > > > On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > > > > Add a new (optional) field to denote the physical location of a device
> > > > > in the system, and expose it in sysfs. This was discussed here:
> > > > > https://lore.kernel.org/linux-acpi/[email protected]/
> > > > >
> > > > > (The primary choice for attribute name i.e. "location" is already
> > > > > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > > > > that want to support this new attribute can opt-in by setting a flag in
> > > > > bus_type, and then populating the location of device while enumerating
> > > > > it.
> > > >
> > > > So why not just call it "physical_location"?
> > >
> > > That's better, and will allow us to put "3rd blue plug from the left,
> > > 4th row down" in there someday :)
> > >
> > > All of this is "relative" to the CPU, right? But what CPU? Again, how
> > > are the systems with drawers of PCI and CPUs and memory that can be
> > > added/removed at any point in time being handled here? What is
> > > "internal" and "external" for them?
> > >
> > > What exactly is the physical boundary here that is attempting to be
> > > described?
> >
> > Also, where is the "physical location" information going to come from?
>
> Who knows? :)
>
> Some BIOS seem to provide this, but do you trust that?
>
> > If that is the platform firmware (which I suspect is the anticipated
> > case), there may be problems with reliability related to that.
>
> s/may/will/
>
> which means making the kernel enact a policy like this patch series
> tries to add, will result in a lot of broken systems, which is why I
> keep saying that it needs to be done in userspace.
>
> It's as if some of us haven't been down this road before and just keep
> being ignored...
>
> {sigh}

Well, to be honest, if you are a "vertical" vendor and you control the
entire stack, *including* the platform firmware, it would be kind of
OK for you to do that in a product kernel.

However, this is not a practical thing to do in the mainline kernel
which must work for everybody, including people who happen to use
systems with broken or even actively unfriendly firmware on them.

So I'm inclined to say that IMO this series "as is" would not be an
improvement from the mainline perspective.

I guess it would make sense to have an attribute for user space to
write to in order to make the kernel reject device plug-in events
coming from a given port or connector, but the kernel has no reliable
means to determine *which* ports or connectors are "safe", and even if
there was a way for it to do that, it still may not agree with user
space on which ports or connectors should be regarded as "safe".

Cheers!

2020-06-30 18:40:32

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Tue, Jun 30, 2020 at 06:08:31PM +0200, Rafael J. Wysocki wrote:
> On Tue, Jun 30, 2020 at 5:38 PM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Tue, Jun 30, 2020 at 03:00:34PM +0200, Rafael J. Wysocki wrote:
> > > On Tue, Jun 30, 2020 at 2:52 PM Greg Kroah-Hartman
> > > <[email protected]> wrote:
> > > >
> > > > On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> > > > > On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > > > > > Add a new (optional) field to denote the physical location of a device
> > > > > > in the system, and expose it in sysfs. This was discussed here:
> > > > > > https://lore.kernel.org/linux-acpi/[email protected]/
> > > > > >
> > > > > > (The primary choice for attribute name i.e. "location" is already
> > > > > > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > > > > > that want to support this new attribute can opt-in by setting a flag in
> > > > > > bus_type, and then populating the location of device while enumerating
> > > > > > it.
> > > > >
> > > > > So why not just call it "physical_location"?
> > > >
> > > > That's better, and will allow us to put "3rd blue plug from the left,
> > > > 4th row down" in there someday :)
> > > >
> > > > All of this is "relative" to the CPU, right? But what CPU? Again, how
> > > > are the systems with drawers of PCI and CPUs and memory that can be
> > > > added/removed at any point in time being handled here? What is
> > > > "internal" and "external" for them?
> > > >
> > > > What exactly is the physical boundary here that is attempting to be
> > > > described?
> > >
> > > Also, where is the "physical location" information going to come from?
> >
> > Who knows? :)
> >
> > Some BIOS seem to provide this, but do you trust that?
> >
> > > If that is the platform firmware (which I suspect is the anticipated
> > > case), there may be problems with reliability related to that.
> >
> > s/may/will/
> >
> > which means making the kernel enact a policy like this patch series
> > tries to add, will result in a lot of broken systems, which is why I
> > keep saying that it needs to be done in userspace.
> >
> > It's as if some of us haven't been down this road before and just keep
> > being ignored...
> >
> > {sigh}
>
> Well, to be honest, if you are a "vertical" vendor and you control the
> entire stack, *including* the platform firmware, it would be kind of
> OK for you to do that in a product kernel.
>
> However, this is not a practical thing to do in the mainline kernel
> which must work for everybody, including people who happen to use
> systems with broken or even actively unfriendly firmware on them.
>
> So I'm inclined to say that IMO this series "as is" would not be an
> improvement from the mainline perspective.

It can be, we have been using this for USB devices for many many years
now, quite successfully. The key is not to trust that the platform
firmware got it right :)

> I guess it would make sense to have an attribute for user space to
> write to in order to make the kernel reject device plug-in events
> coming from a given port or connector, but the kernel has no reliable
> means to determine *which* ports or connectors are "safe", and even if
> there was a way for it to do that, it still may not agree with user
> space on which ports or connectors should be regarded as "safe".

Again, we have been doing this for USB devices for a very long time, PCI
shouldn't be any different. Why people keep ignoring working solutions
is beyond me, there's nothing "special" about PCI devices here for this
type of "worry" or reasoning to try to create new solutions.

So, again, I ask, go do what USB does, and to do that, take the logic
out of the USB core, make it bus-agnostic, and _THEN_ add it to the PCI
code. Why the original submitter keeps ignoring my request to do this
is beyond me, I guess they like making patches that will get rejected :(
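
For readers unfamiliar with it, the USB mechanism referred to here appears to be the sysfs device authorization interface: a per-host-controller authorized_default attribute and a per-device authorized attribute, both written from userspace. The sketch below shows only the userspace side under that assumption; write_attr() is a hypothetical helper, and the device paths in the usage note are examples that differ per system.

```c
#include <stdio.h>

/*
 * Write a short string to a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	/* fclose() flushes; a failed flush means the write did not land. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A policy daemon could, for example, call write_attr("/sys/bus/usb/devices/usb1/authorized_default", "0") so that newly plugged devices on that host start out deauthorized, and later write_attr("/sys/bus/usb/devices/1-2/authorized", "1") for a device it decides to allow. The decision itself stays entirely in userspace.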

thanks,

greg k-h

2020-06-30 19:04:40

by Saravana Kannan

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Mon, Jun 29, 2020 at 9:49 PM Rajat Jain <[email protected]> wrote:
>
> Add a new (optional) field to denote the physical location of a device
> in the system, and expose it in sysfs. This was discussed here:
> https://lore.kernel.org/linux-acpi/[email protected]/
>
> (The primary choice for attribute name i.e. "location" is already
> exposed as an ABI elsewhere, so settled for "site"). Individual buses
> that want to support this new attribute can opt-in by setting a flag in
> bus_type, and then populating the location of device while enumerating
> it.
>
> Signed-off-by: Rajat Jain <[email protected]>
> ---
> v2: (Initial version)
>
> drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
> include/linux/device.h | 42 ++++++++++++++++++++++++++++++++++++++
> include/linux/device/bus.h | 8 ++++++++
> 3 files changed, 85 insertions(+)
>

<snip> I'm not CC'ed in 4/7, so just replying

> diff --git a/include/linux/device.h b/include/linux/device.h
> index 15460a5ac024a..a4143735ae712 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -428,6 +428,31 @@ enum dl_dev_state {
> DL_DEV_UNBINDING,
> };
>
> +/**
> + * enum device_site - Physical location of the device in the system.
> + * The semantics of values depend on subsystem / bus:
> + *
> + * @SITE_UNKNOWN: Location is Unknown (default)
> + *
> + * @SITE_INTERNAL: Device is internal to the system, and cannot be (easily)
> + * removed. E.g. SoC internal devices, onboard soldered
> + * devices, internal M.2 cards (that cannot be removed
> + * without opening the chassis).
> + * @SITE_EXTENDED: Device sits on an extension of the system. E.g. devices
> + * on external PCIe trays, docking stations etc. These
> + * devices may be removable, but are generally housed
> + * internally on an extension board, so they are removed
> + * only when that whole extension board is removed.
> + * @SITE_EXTERNAL: Devices truly external to the system (i.e. plugged on
> + * an external port) that may be removed or added frequently.
> + */
> +enum device_site {
> + SITE_UNKNOWN = 0,
> + SITE_INTERNAL,
> + SITE_EXTENDED,
> + SITE_EXTERNAL,
> +};
> +
> /**
> * struct dev_links_info - Device data related to device links.
> * @suppliers: List of links to supplier devices.
> @@ -513,6 +538,7 @@ struct dev_links_info {
> * device (i.e. the bus driver that discovered the device).
> * @iommu_group: IOMMU group the device belongs to.
> * @iommu: Per device generic IOMMU runtime data
> + * @site: Physical location of the device w.r.t. the system
> *
> * @offline_disabled: If set, the device is permanently online.
> * @offline: Set after successful invocation of bus type's .offline().
> @@ -613,6 +639,8 @@ struct device {
> struct iommu_group *iommu_group;
> struct dev_iommu *iommu;
>
> + enum device_site site; /* Device physical location */
> +
> bool offline_disabled:1;
> bool offline:1;
> bool of_node_reused:1;
> @@ -806,6 +834,20 @@ static inline bool dev_has_sync_state(struct device *dev)
> return false;
> }
>
> +static inline int dev_set_site(struct device *dev, enum device_site site)
> +{
> + if (site < SITE_UNKNOWN || site > SITE_EXTERNAL)
> + return -EINVAL;
> +
> + dev->site = site;
> + return 0;
> +}
> +
> +static inline bool dev_is_external(struct device *dev)
> +{
> + return dev->site == SITE_EXTERNAL;
> +}

I'm not CC'ed in the rest of the patches in this series, so just
responding here. I see you use this function in patch 6/7 to decide if
the PCI device is trusted. Anything other than EXTERNAL is being
treated as trusted. I'd argue that anything that's not internal should
be distrusted. For example, I can have a hacked up laptop dock that I
can share with you when you visit my home/office and now you are
trusting it when you shouldn't be.

Also, "UNKNOWN" is treated as trusted in patch 6/7. I'm guessing this
is because some devices might not have the info in their firmware? In
that case, this feature isn't even protecting all the PCI ports
properly. This adds to Greg's point that this should be a userspace
policy so that it can override whatever is wrong/missing in the
firmware.
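
The two policies being contrasted above can be put side by side in a
minimal userspace sketch. The enum mirrors the one in the patch. The
helper names site_trusted_as_posted() and site_trusted_strict() are
hypothetical and exist only for this illustration; the first reflects
the behaviour described for patch 6/7 (only EXTERNAL is untrusted), the
second the alternative argued for here (only INTERNAL is trusted).

```c
#include <stdbool.h>

/* Mirrors enum device_site from the patch under review. */
enum device_site {
	SITE_UNKNOWN = 0,
	SITE_INTERNAL,
	SITE_EXTENDED,
	SITE_EXTERNAL,
};

/* Policy as described for patch 6/7: only SITE_EXTERNAL is untrusted. */
static bool site_trusted_as_posted(enum device_site site)
{
	return site != SITE_EXTERNAL;
}

/* Hypothetical stricter policy: only SITE_INTERNAL is trusted. */
static bool site_trusted_strict(enum device_site site)
{
	return site == SITE_INTERNAL;
}
```

The two functions disagree exactly on SITE_UNKNOWN and SITE_EXTENDED,
which are the dock and missing-firmware-info cases raised above.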

-Saravana

2020-07-01 18:07:55

by Rajat Jain

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

Hello,

On Tue, Jun 30, 2020 at 10:00 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Tue, Jun 30, 2020 at 06:08:31PM +0200, Rafael J. Wysocki wrote:
> > On Tue, Jun 30, 2020 at 5:38 PM Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > On Tue, Jun 30, 2020 at 03:00:34PM +0200, Rafael J. Wysocki wrote:
> > > > On Tue, Jun 30, 2020 at 2:52 PM Greg Kroah-Hartman
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jun 30, 2020 at 01:49:48PM +0300, Heikki Krogerus wrote:
> > > > > > On Mon, Jun 29, 2020 at 09:49:41PM -0700, Rajat Jain wrote:
> > > > > > > Add a new (optional) field to denote the physical location of a device
> > > > > > > in the system, and expose it in sysfs. This was discussed here:
> > > > > > > https://lore.kernel.org/linux-acpi/[email protected]/
> > > > > > >
> > > > > > > (The primary choice for attribute name i.e. "location" is already
> > > > > > > exposed as an ABI elsewhere, so settled for "site"). Individual buses
> > > > > > > that want to support this new attribute can opt-in by setting a flag in
> > > > > > > bus_type, and then populating the location of device while enumerating
> > > > > > > it.
> > > > > >
> > > > > > So why not just call it "physical_location"?
> > > > >
> > > > > That's better, and will allow us to put "3rd blue plug from the left,
> > > > > 4th row down" in there someday :)
> > > > >
> > > > > All of this is "relative" to the CPU, right? But what CPU? Again, how
> > > > > are the systems with drawers of PCI and CPUs and memory that can be
> > > > > added/removed at any point in time being handled here? What is
> > > > > "internal" and "external" for them?
> > > > >
> > > > > > What exactly is the physical boundary here that is attempting to be
> > > > > described?
> > > >
> > > > Also, where is the "physical location" information going to come from?
> > >
> > > Who knows? :)
> > >
> > > Some BIOS seem to provide this, but do you trust that?
> > >
> > > > If that is the platform firmware (which I suspect is the anticipated
> > > > case), there may be problems with reliability related to that.
> > >
> > > s/may/will/
> > >
> > > which means making the kernel enact a policy like this patch series
> > > tries to add, will result in a lot of broken systems, which is why I
> > > keep saying that it needs to be done in userspace.
> > >
> > > It's as if some of us haven't been down this road before and just keep
> > > being ignored...
> > >
> > > {sigh}
> >
> > Well, to be honest, if you are a "vertical" vendor and you control the
> > entire stack, *including* the platform firmware, it would be kind of
> > OK for you to do that in a product kernel.
> >
> > However, this is not a practical thing to do in the mainline kernel
> > which must work for everybody, including people who happen to use
> > systems with broken or even actively unfriendly firmware on them.
> >
> > So I'm inclined to say that IMO this series "as is" would not be an
> > improvement from the mainline perspective.
>
> It can be, we have been using this for USB devices for many many years
> now, quite successfully. The key is not to trust that the platform
> firmware got it right :)
>
> > I guess it would make sense to have an attribute for user space to
> > write to in order to make the kernel reject device plug-in events
> > coming from a given port or connector, but the kernel has no reliable
> > means to determine *which* ports or connectors are "safe", and even if
> > there was a way for it to do that, it still may not agree with user
> > space on which ports or connectors should be regarded as "safe".
>
> Again, we have been doing this for USB devices for a very long time, PCI
> shouldn't be any different. Why people keep ignoring working solutions
> is beyond me, there's nothing "special" about PCI devices here for this
> type of "worry" or reasoning to try to create new solutions.
>
> So, again, I ask, go do what USB does, and to do that, take the logic
> out of the USB core, make it bus-agnostic, and _THEN_ add it to the PCI
> code. Why the original submitter keeps ignoring my request to do this
> is beyond me, I guess they like making patches that will get rejected :(

IMHO I'm trying to do precisely what I think was the conclusion of our
discussion, plus some changes made in response to the further feedback
I received on those patches. Let's take a step back, and please allow
me to explain how I got here (my apologies, but this spans a couple of
threads, and I'm trying to tie them all together here):

GOAL: To allow user space to control which (PCI) drivers it wants to
allow on external (Thunderbolt) ports. There was a lot of debate about
the need for such a policy at
https://lore.kernel.org/linux-pci/CACK8Z6GR7-wseug=TtVyRarVZX_ao2geoLDNBwjtB+5Y7VWNEQ@mail.gmail.com/
with the final conclusion that it should be OK to implement such a
policy in userspace, as long as the policy is not implemented in the
kernel. The kernel only needs to expose the bits & info that userspace
needs to implement such a policy, and they can be used in conjunction
with "drivers_autoprobe" to implement it:
--------------------------------------------------------------------
....
That's an odd thing, but sure, if you want to write up such a policy for
your systems, great. But that policy does not belong in the kernel, it
belongs in userspace.
....
--------------------------------------------------------------------

1) The post https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
lists out the approach that was agreed on. Replicating it here:
-----------------------------------------------------------------------
- Expose the PCI pdev->untrusted bit in sysfs. We don't expose this
today, but doing so would be trivial. I think I would prefer a
sysfs name like "external" so it's more descriptive and less of a
judgment.

This comes from either the DT "external-facing" property or the
ACPI "ExternalFacingPort" property.

- All devices present at boot are enumerated. Any statically built
drivers will bind to them before any userspace code runs.

If you want to keep statically built drivers from binding, you'd
need to invent some mechanism so pci_driver_init() could clear
drivers_autoprobe after registering pci_bus_type.

- Early userspace code prevents modular drivers from automatically
binding to PCI devices:

echo 0 > /sys/bus/pci/drivers_autoprobe

This prevents modular drivers from binding to all devices, whether
present at boot or hot-added.

- Userspace code uses the sysfs "bind" file to control which drivers
are loaded and can bind to each device, e.g.,

echo 0000:02:00.0 > /sys/bus/pci/drivers/nvme/bind
-----------------------------------------------------------------------
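
The userspace half of the approach quoted above (clear
drivers_autoprobe, then bind selected devices by hand) could look
roughly like the sketch below. The helper names sysfs_write() and
allow_only() are hypothetical, and the sysfs root is passed in as a
parameter only so that the code can be tried against a scratch
directory; on a real system it would be "/sys".

```c
#include <errno.h>
#include <stdio.h>

/* Write a short string to a sysfs-style file; returns 0 or -errno. */
static int sysfs_write(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		ret = -EIO;
	/* The buffered write is flushed here, so check fclose() too. */
	if (fclose(f) == EOF && ret == 0)
		ret = -errno;
	return ret;
}

/*
 * Hypothetical policy step: disable autoprobe on the PCI bus, then bind
 * one allowed device (given as domain:bus:dev.fn) to one driver.
 */
static int allow_only(const char *sysfs_root, const char *driver,
		      const char *bdf)
{
	char path[256];
	int ret;

	snprintf(path, sizeof(path), "%s/bus/pci/drivers_autoprobe",
		 sysfs_root);
	ret = sysfs_write(path, "0");
	if (ret)
		return ret;

	snprintf(path, sizeof(path), "%s/bus/pci/drivers/%s/bind",
		 sysfs_root, driver);
	return sysfs_write(path, bdf);
}
```

With these helpers, allow_only("/sys", "nvme", "0000:02:00.0") would
correspond to the two echo commands in the quoted list. It does nothing
about drivers that are built in and have already bound by then.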

2) As part of implementing the above agreed approach, when I exposed
the PCI "untrusted" attribute to userspace, the resulting discussion
concluded that the device core should instead be enhanced with a
location attribute.
https://lore.kernel.org/linux-pci/[email protected]/
-----------------------------------------------------------------------
...
The attribute should be called something like "location" or something
like that (naming is hard), as you don't always know if something is
external or not (it could be internal, it could be unknown, it could be
internal to an external device that you trust (think PCI drawers for
"super" computers that are hot pluggable but yet really part of the
internal bus).
....
"trust" has no direct relation to the location, except in a policy of
what you wish to do with that device, so as long as you keep them
separate that way, I am fine with it.
...
-----------------------------------------------------------------------

And hence this patch. I don't see an attribute in USB comparable to
this new attribute, except maybe the boolean "removable". Are you
suggesting to pull that into the device core instead of adding this
"physical_location" attribute?

3) The one deviation from the agreed approach in (1) is
https://patchwork.kernel.org/patch/11633095/ . The reason is that I
realized that, contrary to what I earlier believed, we might not be
able to disable the PCI link to all external PCI devices at boot. So
external PCI devices may actually bind to drivers before userspace
comes up and does "echo 0 > /sys/bus/pci/drivers_autoprobe".

I'm really happy to do what you think is the right way as long as it
helps achieve my goal above. Really looking for clear directions here.

Thanks & Best Regards,

Rajat


> thanks,
>
> greg k-h

2020-07-02 05:26:05

by Oliver O'Halloran

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Thu, Jul 2, 2020 at 4:07 AM Rajat Jain <[email protected]> wrote:
>
> *snip*
>
> > > I guess it would make sense to have an attribute for user space to
> > > write to in order to make the kernel reject device plug-in events
> > > coming from a given port or connector, but the kernel has no reliable
> > > means to determine *which* ports or connectors are "safe", and even if
> > > there was a way for it to do that, it still may not agree with user
> > > space on which ports or connectors should be regarded as "safe".
> >
> > Again, we have been doing this for USB devices for a very long time, PCI
> > shouldn't be any different. Why people keep ignoring working solutions
> > is beyond me, there's nothing "special" about PCI devices here for this
> > type of "worry" or reasoning to try to create new solutions.
> >
> > So, again, I ask, go do what USB does, and to do that, take the logic
> > out of the USB core, make it bus-agnostic, and _THEN_ add it to the PCI
> > code. Why the original submitter keeps ignoring my request to do this
> > is beyond me, I guess they like making patches that will get rejected :(
>
> IMHO I'm actually trying to precisely do what I think was the
> conclusion of our discussion, and then some changes because of the
> further feedback I received on those patches. Let's take a step back
> and please allow me to explain how I got here (my apologies but this
> spans a couple of threads, and I'm trying to tie them all together
> here):

The previous thread had some suggestions, but no real conclusions.
That's probably why we're still arguing about it...

> GOAL: To allow user space to control what (PCI) drivers he wants to
> allow on external (thunderbolt) ports. There was a lot of debate about
> the need for such a policy at
> https://lore.kernel.org/linux-pci/CACK8Z6GR7-wseug=TtVyRarVZX_ao2geoLDNBwjtB+5Y7VWNEQ@mail.gmail.com/
> with the final conclusion that it should be OK to implement such a
> policy in userspace, as long as the policy is not implemented in the
> kernel. The kernel only needs to expose bits & info that is needed by
> the userspace to implement such a policy, and it can be used in
> conjunction with "drivers_autoprobe" to implement this policy:
> --------------------------------------------------------------------
> ....
> That's an odd thing, but sure, if you want to write up such a policy for
> your systems, great. But that policy does not belong in the kernel, it
> belongs in userspace.
> ....
> --------------------------------------------------------------------
> 1) The post https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
> lists out the approach that was agreed on. Replicating it here:
> -----------------------------------------------------------------------
> - Expose the PCI pdev->untrusted bit in sysfs. We don't expose this
> today, but doing so would be trivial. I think I would prefer a
> sysfs name like "external" so it's more descriptive and less of a
> judgment.
>
> This comes from either the DT "external-facing" property or the
> ACPI "ExternalFacingPort" property.
>
> - All devices present at boot are enumerated. Any statically built
> drivers will bind to them before any userspace code runs.
>
> If you want to keep statically built drivers from binding, you'd
> need to invent some mechanism so pci_driver_init() could clear
> drivers_autoprobe after registering pci_bus_type.
>
> - Early userspace code prevents modular drivers from automatically
> binding to PCI devices:
>
> echo 0 > /sys/bus/pci/drivers_autoprobe
>
> This prevents modular drivers from binding to all devices, whether
> present at boot or hot-added.
>
> - Userspace code uses the sysfs "bind" file to control which drivers
> are loaded and can bind to each device, e.g.,
>
> echo 0000:02:00.0 > /sys/bus/pci/drivers/nvme/bind

I think this is a reasonable suggestion. However, as Greg pointed out
it's gratuitously different to what USB does for no real reason.

> -----------------------------------------------------------------------
> 2) As part of implementing the above agreed approach, when I exposed
> PCI "untrusted" attribute to userspace, it ran into discussion that
> concluded that instead of this, the device core should be enhanced
> with a location attribute.
> https://lore.kernel.org/linux-pci/[email protected]/
> -----------------------------------------------------------------------
> ...
> The attribute should be called something like "location" or something
> like that (naming is hard), as you don't always know if something is
> external or not (it could be internal, it could be unknown, it could be
> internal to an external device that you trust (think PCI drawers for
> "super" computers that are hot pluggable but yet really part of the
> internal bus).
> ....
> "trust" has no direct relation to the location, except in a policy of
> what you wish to do with that device, so as long as you keep them
> separate that way, I am fine with it.
> ...
> -----------------------------------------------------------------------
>
> And hence this patch. I don't see an attribute in USB comparable to
> this new attribute, except for the boolean "removable" may be. Are you
> suggesting to pull that into the device core instead of adding this
> "physical_location" attribute?

He's suggesting you pull the "authorized" attribute into the driver
core. That's the mechanism USB uses to block drivers binding unless
userspace authorizes them. I don't see any reason why we can't re-use
that sysfs interface for PCI devices since the problem being solved is
fundamentally the same. The main question is what we should do as a
default policy in the kernel. For USB the default comes from the
"authorized_default" module param of usbcore:

> /* authorized_default behaviour:
> * -1 is authorized for all devices except wireless (old behaviour)
> * 0 is unauthorized for all devices
> * 1 is authorized for all devices
> * 2 is authorized for internal devices
> */
> #define USB_AUTHORIZE_WIRED -1
> #define USB_AUTHORIZE_NONE 0
> #define USB_AUTHORIZE_ALL 1
> #define USB_AUTHORIZE_INTERNAL 2
>
> static int authorized_default = USB_AUTHORIZE_WIRED;
> module_param(authorized_default, int, S_IRUGO|S_IWUSR);

So the default policy for USB is to authorize any wired USB device and
we can optionally restrict that to just integrated devices. Sounding
familiar?
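
As a rough illustration of how such a default could be evaluated in a
bus-agnostic way, here is a small userspace model. It is not the
usbcore implementation: struct new_device and default_authorized() are
hypothetical names, and only the four USB_AUTHORIZE_* values are taken
from the snippet quoted above.

```c
#include <stdbool.h>

/* Values as quoted above from usbcore. */
#define USB_AUTHORIZE_WIRED	-1
#define USB_AUTHORIZE_NONE	0
#define USB_AUTHORIZE_ALL	1
#define USB_AUTHORIZE_INTERNAL	2

/* Hypothetical, bus-agnostic description of a newly discovered device. */
struct new_device {
	bool wireless;	/* attached over a wireless link */
	bool internal;	/* sits on a port that is not user-accessible */
};

/* Model of the default decision; userspace may still change it later. */
static bool default_authorized(int authorized_default,
			       const struct new_device *dev)
{
	switch (authorized_default) {
	case USB_AUTHORIZE_ALL:
		return true;
	case USB_AUTHORIZE_INTERNAL:
		return dev->internal;
	case USB_AUTHORIZE_WIRED:
		return !dev->wireless;
	case USB_AUTHORIZE_NONE:
	default:
		return false;
	}
}
```

The USB_AUTHORIZE_INTERNAL case is the one that lines up with what this
series is trying to do for PCI.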

The internal / external status is still useful to know so we might
want to make a sysfs attribute for that too. However, I'd like to
point out that internal / external isn't the whole story. As I
mentioned in the last thread if I have a BMC device I *really* don't
want it to be authorized by default even though it's an internal
device. Similarly, if I know all my internal cards support PCIe
Component Authentication then I might choose not to trust any PCI
devices unless they authenticate successfully.
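
One way to picture "internal / external isn't the whole story" is a
policy where an explicit per-device rule overrides the location-based
default. The sketch below is purely illustrative: struct auth_rule and
device_authorized() are hypothetical names, and nothing like them is
proposed in this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device override keyed on PCI vendor/device ID. */
struct auth_rule {
	unsigned short vendor;
	unsigned short device;
	bool authorize;
};

/*
 * An explicit rule wins; otherwise fall back to a location-based
 * default that authorizes internal devices only.
 */
static bool device_authorized(const struct auth_rule *rules, size_t nrules,
			      unsigned short vendor, unsigned short device,
			      bool internal)
{
	size_t i;

	for (i = 0; i < nrules; i++) {
		if (rules[i].vendor == vendor && rules[i].device == device)
			return rules[i].authorize;
	}
	return internal;
}
```

A rule with authorize set to false for the BMC's IDs would keep it
unauthorized even though it is reported as internal.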

> 3) The one deviation from the agreed approach in (1) is
> https://patchwork.kernel.org/patch/11633095/ . The reason is I
> realized that contrary to what I earlier believed, we might not be
> able to disable the PCI link to all external PCI devices at boot. So
> external PCI devices may actually bind to drivers before userspace
> comes up and does "echo 0 > /sys/bus/pci/drivers_autoprobe".

Yep, that's a problem. If we want to provide a useful mechanism to
userspace then the default behaviour of the kernel can't undermine
that mechanism. If that means we need another kernel command line
parameter then I guess we just have to live with it.

Oliver

2020-07-02 07:33:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Thu, Jul 02, 2020 at 03:23:23PM +1000, Oliver O'Halloran wrote:
> On Thu, Jul 2, 2020 at 4:07 AM Rajat Jain <[email protected]> wrote:
> >
> > *snip*
> >
> > > > I guess it would make sense to have an attribute for user space to
> > > > write to in order to make the kernel reject device plug-in events
> > > > coming from a given port or connector, but the kernel has no reliable
> > > > means to determine *which* ports or connectors are "safe", and even if
> > > > there was a way for it to do that, it still may not agree with user
> > > > space on which ports or connectors should be regarded as "safe".
> > >
> > > Again, we have been doing this for USB devices for a very long time, PCI
> > > shouldn't be any different. Why people keep ignoring working solutions
> > > is beyond me, there's nothing "special" about PCI devices here for this
> > > type of "worry" or reasoning to try to create new solutions.
> > >
> > > So, again, I ask, go do what USB does, and to do that, take the logic
> > > out of the USB core, make it bus-agnostic, and _THEN_ add it to the PCI
> > > code. Why the original submitter keeps ignoring my request to do this
> > > is beyond me, I guess they like making patches that will get rejected :(
> >
> > IMHO I'm actually trying to precisely do what I think was the
> > conclusion of our discussion, and then some changes because of the
> > further feedback I received on those patches. Let's take a step back
> > and please allow me to explain how I got here (my apologies but this
> > spans a couple of threads, and I'm trying to tie them all together
> > here):
>
> The previous thread had some suggestions, but no real conclusions.
> That's probably why we're still arguing about it...
>
> > GOAL: To allow user space to control what (PCI) drivers he wants to
> > allow on external (thunderbolt) ports. There was a lot of debate about
> > the need for such a policy at
> > https://lore.kernel.org/linux-pci/CACK8Z6GR7-wseug=TtVyRarVZX_ao2geoLDNBwjtB+5Y7VWNEQ@mail.gmail.com/
> > with the final conclusion that it should be OK to implement such a
> > policy in userspace, as long as the policy is not implemented in the
> > kernel. The kernel only needs to expose bits & info that is needed by
> > the userspace to implement such a policy, and it can be used in
> > conjunction with "drivers_autoprobe" to implement this policy:
> > --------------------------------------------------------------------
> > ....
> > That's an odd thing, but sure, if you want to write up such a policy for
> > your systems, great. But that policy does not belong in the kernel, it
> > belongs in userspace.
> > ....
> > --------------------------------------------------------------------
> > 1) The post https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
> > lists out the approach that was agreed on. Replicating it here:
> > -----------------------------------------------------------------------
> > - Expose the PCI pdev->untrusted bit in sysfs. We don't expose this
> > today, but doing so would be trivial. I think I would prefer a
> > sysfs name like "external" so it's more descriptive and less of a
> > judgment.
> >
> > This comes from either the DT "external-facing" property or the
> > ACPI "ExternalFacingPort" property.
> >
> > - All devices present at boot are enumerated. Any statically built
> > drivers will bind to them before any userspace code runs.
> >
> > If you want to keep statically built drivers from binding, you'd
> > need to invent some mechanism so pci_driver_init() could clear
> > drivers_autoprobe after registering pci_bus_type.
> >
> > - Early userspace code prevents modular drivers from automatically
> > binding to PCI devices:
> >
> > echo 0 > /sys/bus/pci/drivers_autoprobe
> >
> > This prevents modular drivers from binding to all devices, whether
> > present at boot or hot-added.
> >
> > - Userspace code uses the sysfs "bind" file to control which drivers
> > are loaded and can bind to each device, e.g.,
> >
> > echo 0000:02:00.0 > /sys/bus/pci/drivers/nvme/bind
>
> I think this is a reasonable suggestion. However, as Greg pointed out
> it's gratuitously different to what USB does for no real reason.

Agreed.

> > -----------------------------------------------------------------------
> > 2) As part of implementing the above agreed approach, when I exposed
> > PCI "untrusted" attribute to userspace, it ran into discussion that
> > concluded that instead of this, the device core should be enhanced
> > with a location attribute.
> > https://lore.kernel.org/linux-pci/[email protected]/
> > -----------------------------------------------------------------------
> > ...
> > The attribute should be called something like "location" or something
> > like that (naming is hard), as you don't always know if something is
> > external or not (it could be internal, it could be unknown, it could be
> > internal to an external device that you trust (think PCI drawers for
> > "super" computers that are hot pluggable but yet really part of the
> > internal bus).
> > ....
> > "trust" has no direct relation to the location, except in a policy of
> > what you wish to do with that device, so as long as you keep them
> > separate that way, I am fine with it.
> > ...
> > -----------------------------------------------------------------------
> >
> > And hence this patch. I don't see an attribute in USB comparable to
> > this new attribute, except for the boolean "removable" may be. Are you
> > suggesting to pull that into the device core instead of adding this
> > "physical_location" attribute?
>
> He's suggesting you pull the "authorized" attribute into the driver
> core. That's the mechanism USB uses to block drivers binding unless
> userspace authorizes them. I don't see any reason why we can't re-use
> that sysfs interface for PCI devices since the problem being solved is
> fundamentally the same. The main question is what we should do as a
> default policy in the kernel. For USB the default comes from the
> "authorized_default" module param of usbcore:
>
> > /* authorized_default behaviour:
> > * -1 is authorized for all devices except wireless (old behaviour)
> > * 0 is unauthorized for all devices
> > * 1 is authorized for all devices
> > * 2 is authorized for internal devices
> > */
> > #define USB_AUTHORIZE_WIRED -1
> > #define USB_AUTHORIZE_NONE 0
> > #define USB_AUTHORIZE_ALL 1
> > #define USB_AUTHORIZE_INTERNAL 2
> >
> > static int authorized_default = USB_AUTHORIZE_WIRED;
> > module_param(authorized_default, int, S_IRUGO|S_IWUSR);
>
> So the default policy for USB is to authorize any wired USB device and
> we can optionally restrict that to just integrated devices. Sounding
> familiar?

Thank you, that is what I have been trying to get across here, obviously
I didn't do a good job. :)

Thanks for the summary.

> The internal / external status is still useful to know so we might
> want to make a sysfs attribute for that too. However, I'd like to
> point out that internal / external isn't the whole story. As I
> mentioned in the last thread if I have a BMC device I *really* don't
> want it to be authorized by default even though it's an internal
> device. Similarly, if I know all my internal cards support PCIe
> Component Authentication then I might choose not to trust any PCI
> devices unless they authenticate successfully.

Agreed.

> > 3) The one deviation from the agreed approach in (1) is
> > https://patchwork.kernel.org/patch/11633095/ . The reason is I
> > realized that contrary to what I earlier believed, we might not be
> > able to disable the PCI link to all external PCI devices at boot. So
> > external PCI devices may actually bind to drivers before userspace
> > comes up and does "echo 0 > /sys/bus/pci/drivers_autoprobe".
>
> Yep, that's a problem. If we want to provide a useful mechanism to
> userspace then the default behaviour of the kernel can't undermine
> that mechanism. If that means we need another kernel command line
> parameter then I guess we just have to live with it.

I really do not want yet-another-kernel-command-line-option if we can
help it at all. Sane defaults are the best thing to do here. Userspace
comes up really early, put your policy in there, not in blobs passed
from your bootloader.

thanks,

greg k-h

2020-07-02 08:41:11

by Oliver O'Halloran

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Thu, 2020-07-02 at 09:32 +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 02, 2020 at 03:23:23PM +1000, Oliver O'Halloran wrote:
> > Yep, that's a problem. If we want to provide a useful mechanism to
> > userspace then the default behaviour of the kernel can't undermine
> > that mechanism. If that means we need another kernel command line
> > parameter then I guess we just have to live with it.
>
> I really do not want yet-another-kernel-command-line-option if we can
> help it at all. Sane defaults are the best thing to do here. Userspace
> comes up really early, put your policy in there, not in blobs passed
> from your bootloader.

Userspace comes up early, but builtin drivers will bind before init is
started. e.g.

# dmesg | egrep '0002:01:00.0|/init'
[ 0.976800][ T1] pci 0002:01:00.0: [8086:1589] type 00 class 0x020000
[ 0.976923][ T1] pci 0002:01:00.0: reg 0x10: [mem 0x220000000000-0x2200007fffff 64bit pref]
[ 0.977004][ T1] pci 0002:01:00.0: reg 0x1c: [mem 0x220002000000-0x220002007fff 64bit pref]
[ 0.977068][ T1] pci 0002:01:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
[ 0.977122][ T1] pci 0002:01:00.0: BAR3 [mem size 0x00008000 64bit pref]: requesting alignment to 0x10000
[ 0.977401][ T1] pci 0002:01:00.0: PME# supported from D0 D3hot
[ 1.011929][ T1] pci 0002:01:00.0: BAR 0: assigned [mem 0x220000000000-0x2200007fffff 64bit pref]
[ 1.012085][ T1] pci 0002:01:00.0: BAR 6: assigned [mem 0x3fe100000000-0x3fe10007ffff pref]
[ 1.012127][ T1] pci 0002:01:00.0: BAR 3: assigned [mem 0x220002000000-0x220002007fff 64bit pref]
[ 4.399588][ T12] i40e 0002:01:00.0: enabling device (0140 -> 0142)
[ 4.410891][ T12] i40e 0002:01:00.0: fw 5.1.40981 api 1.5 nvm 5.03 0x80002469 1.1313.0 [8086:1589] [15d9:0000]
[ 4.647524][ T12] i40e 0002:01:00.0: MAC address: 0c:c4:7a:b7:fc:74
[ 4.647685][ T12] i40e 0002:01:00.0: FW LLDP is enabled
[ 4.653918][ T12] i40e 0002:01:00.0 eth0: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
[ 4.655552][ T12] i40e 0002:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
[ 4.656071][ T12] i40e 0002:01:00.0: Features: PF-id[0] VSIs: 34 QP: 80 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
[ 13.803709][ T1] Run /init as init process
[ 13.963242][ T711] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0

Building everything into the kernel is admittedly pretty niche. I only
do it to avoid re-building the initramfs for my test kernels. It does
seem relatively common on embedded systems, but I'm not sure how many
of those care about PCIe. It would be nice to provide *something* to
cover that case for the people who care.

Oliver


2020-07-02 08:53:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Thu, Jul 02, 2020 at 06:40:09PM +1000, Oliver O'Halloran wrote:
> On Thu, 2020-07-02 at 09:32 +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 02, 2020 at 03:23:23PM +1000, Oliver O'Halloran wrote:
> > > Yep, that's a problem. If we want to provide a useful mechanism to
> > > userspace then the default behaviour of the kernel can't undermine
> > > that mechanism. If that means we need another kernel command line
> > > parameter then I guess we just have to live with it.
> >
> > I really do not want yet-another-kernel-command-line-option if we can
> > help it at all. Sane defaults are the best thing to do here. Userspace
> > comes up really early, put your policy in there, not in blobs passed
> > from your bootloader.
>
> Userspace comes up early, but builtin drivers will bind before init is
> started. e.g.
>
> # dmesg | egrep '0002:01:00.0|/init'
> [ 0.976800][ T1] pci 0002:01:00.0: [8086:1589] type 00 class 0x020000
> [ 0.976923][ T1] pci 0002:01:00.0: reg 0x10: [mem 0x220000000000-0x2200007fffff 64bit pref]
> [ 0.977004][ T1] pci 0002:01:00.0: reg 0x1c: [mem 0x220002000000-0x220002007fff 64bit pref]
> [ 0.977068][ T1] pci 0002:01:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
> [ 0.977122][ T1] pci 0002:01:00.0: BAR3 [mem size 0x00008000 64bit pref]: requesting alignment to 0x10000
> [ 0.977401][ T1] pci 0002:01:00.0: PME# supported from D0 D3hot
> [ 1.011929][ T1] pci 0002:01:00.0: BAR 0: assigned [mem 0x220000000000-0x2200007fffff 64bit pref]
> [ 1.012085][ T1] pci 0002:01:00.0: BAR 6: assigned [mem 0x3fe100000000-0x3fe10007ffff pref]
> [ 1.012127][ T1] pci 0002:01:00.0: BAR 3: assigned [mem 0x220002000000-0x220002007fff 64bit pref]
> [ 4.399588][ T12] i40e 0002:01:00.0: enabling device (0140 -> 0142)
> [ 4.410891][ T12] i40e 0002:01:00.0: fw 5.1.40981 api 1.5 nvm 5.03 0x80002469 1.1313.0 [8086:1589] [15d9:0000]
> [ 4.647524][ T12] i40e 0002:01:00.0: MAC address: 0c:c4:7a:b7:fc:74
> [ 4.647685][ T12] i40e 0002:01:00.0: FW LLDP is enabled
> [ 4.653918][ T12] i40e 0002:01:00.0 eth0: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
> [ 4.655552][ T12] i40e 0002:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
> [ 4.656071][ T12] i40e 0002:01:00.0: Features: PF-id[0] VSIs: 34 QP: 80 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
> [ 13.803709][ T1] Run /init as init process
> [ 13.963242][ T711] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0
>
> Building everything into the kernel is admittedly pretty niche. I only
> do it to avoid re-building the initramfs for my test kernels. It does
> seem relatively common on embedded systems, but I'm not sure how many
> of those care about PCIe. It would be nice to provide *something* to
> cover that case for the people who care.

Those people who care should not build those drivers into their kernel :)

2020-07-02 08:56:51

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Thu, Jul 02, 2020 at 10:52:12AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 02, 2020 at 06:40:09PM +1000, Oliver O'Halloran wrote:
> > On Thu, 2020-07-02 at 09:32 +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jul 02, 2020 at 03:23:23PM +1000, Oliver O'Halloran wrote:
> > > > Yep, that's a problem. If we want to provide a useful mechanism to
> > > > userspace then the default behaviour of the kernel can't undermine
> > > > that mechanism. If that means we need another kernel command line
> > > > parameter then I guess we just have to live with it.
> > >
> > > I really do not want yet-another-kernel-command-line-option if we can
> > > help it at all. Sane defaults are the best thing to do here. Userspace
> > > comes up really early, put your policy in there, not in blobs passed
> > > from your bootloader.
> >
> > Userspace comes up early, but builtin drivers will bind before init is
> > started. e.g.
> >
> > # dmesg | egrep '0002:01:00.0|/init'
> > [ 0.976800][ T1] pci 0002:01:00.0: [8086:1589] type 00 class 0x020000
> > [ 0.976923][ T1] pci 0002:01:00.0: reg 0x10: [mem 0x220000000000-0x2200007fffff 64bit pref]
> > [ 0.977004][ T1] pci 0002:01:00.0: reg 0x1c: [mem 0x220002000000-0x220002007fff 64bit pref]
> > [ 0.977068][ T1] pci 0002:01:00.0: reg 0x30: [mem 0x00000000-0x0007ffff pref]
> > [ 0.977122][ T1] pci 0002:01:00.0: BAR3 [mem size 0x00008000 64bit pref]: requesting alignment to 0x10000
> > [ 0.977401][ T1] pci 0002:01:00.0: PME# supported from D0 D3hot
> > [ 1.011929][ T1] pci 0002:01:00.0: BAR 0: assigned [mem 0x220000000000-0x2200007fffff 64bit pref]
> > [ 1.012085][ T1] pci 0002:01:00.0: BAR 6: assigned [mem 0x3fe100000000-0x3fe10007ffff pref]
> > [ 1.012127][ T1] pci 0002:01:00.0: BAR 3: assigned [mem 0x220002000000-0x220002007fff 64bit pref]
> > [ 4.399588][ T12] i40e 0002:01:00.0: enabling device (0140 -> 0142)
> > [ 4.410891][ T12] i40e 0002:01:00.0: fw 5.1.40981 api 1.5 nvm 5.03 0x80002469 1.1313.0 [8086:1589] [15d9:0000]
> > [ 4.647524][ T12] i40e 0002:01:00.0: MAC address: 0c:c4:7a:b7:fc:74
> > [ 4.647685][ T12] i40e 0002:01:00.0: FW LLDP is enabled
> > [ 4.653918][ T12] i40e 0002:01:00.0 eth0: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
> > [ 4.655552][ T12] i40e 0002:01:00.0: PCI-Express: Speed 8.0GT/s Width x8
> > [ 4.656071][ T12] i40e 0002:01:00.0: Features: PF-id[0] VSIs: 34 QP: 80 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
> > [ 13.803709][ T1] Run /init as init process
> > [ 13.963242][ T711] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0
> >
> > Building everything into the kernel is admittedly pretty niche. I only
> > do it to avoid re-building the initramfs for my test kernels. It does
> > seem relatively common on embedded systems, but I'm not sure how many
> > of those care about PCIe. It would be nice to provide *something* to
> > cover that case for the people who care.
>
> Those people who care should not build those drivers into their kernel :)

That being said, that is the _last_ thing to worry about in this type of
patchset, lots of work needs to be done before we can care about this.
In fact, that should just be a totally separate patch after all of the
real work is done here first.

thanks,

greg k-h

2020-07-04 11:47:29

by Pavel Machek

Subject: Re: [PATCH v2 0/7] Tighten PCI security, expose dev location in sysfs

Hi!

> * The first 3 patches tighten the PCI security using ACS, and take care
> of a border case.
> * The 4th patch takes care of PCI bug.
> * 5th and 6th patches expose a device's location into the sysfs to allow
> admin to make decision based on that.

I see no patch for Documentation -- new sysfs interfaces should be
documented for 5/6.

Pavel

> drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
> drivers/iommu/intel/iommu.c | 31 ++++++++++++++++++---------
> drivers/pci/ats.c | 2 +-
> drivers/pci/bus.c | 13 ++++++------
> drivers/pci/of.c | 2 +-
> drivers/pci/p2pdma.c | 2 +-
> drivers/pci/pci-acpi.c | 13 ++++++------
> drivers/pci/pci-driver.c | 1 +
> drivers/pci/pci.c | 34 ++++++++++++++++++++++++++----
> drivers/pci/pci.h | 3 ++-
> drivers/pci/probe.c | 20 +++++++++++-------
> drivers/pci/quirks.c | 19 +++++++++++++----
> include/linux/device.h | 42 +++++++++++++++++++++++++++++++++++++
> include/linux/device/bus.h | 8 +++++++
> include/linux/pci.h | 13 ++++++------
> 15 files changed, 191 insertions(+), 47 deletions(-)
>

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


2020-07-06 16:40:43

by Bjorn Helgaas

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
> The "ExternalFacing" devices (root ports) are still internal devices that
> sit on the internal system fabric and thus trusted. Currently they were
> being marked untrusted.
>
> This patch uses the platform flag to identify the external facing devices
> and then use it to mark any downstream devices as "untrusted". The
> external-facing devices themselves are left as "trusted". This was
> discussed here: https://lkml.org/lkml/2020/6/10/1049

Use the imperative mood in the commit log, as you did for 1/7. E.g.,
instead of "This patch uses ...", say "Use the platform flag ...".
That helps all the commit logs read nicely together.

I think this patch makes two changes that should be separated:

- Treat "external-facing" devices as internal.

- Look for the "external-facing" or "ExternalFacing" property on
Switch Downstream Ports as well as Root Ports.

> Signed-off-by: Rajat Jain <[email protected]>
> ---
> v2: cosmetic changes in commit log
>
> drivers/iommu/intel/iommu.c | 2 +-
> drivers/pci/of.c | 2 +-
> drivers/pci/pci-acpi.c | 13 +++++++------
> drivers/pci/probe.c | 2 +-
> include/linux/pci.h | 8 ++++++++
> 5 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index d759e7234e982..1ccb224f82496 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4743,7 +4743,7 @@ static inline bool has_untrusted_dev(void)
> struct pci_dev *pdev = NULL;
>
> for_each_pci_dev(pdev)
> - if (pdev->untrusted)
> + if (pdev->untrusted || pdev->external_facing)

I think checking pdev->external_facing is enough for this case,
because it's impossible to have pdev->untrusted unless a parent has
pdev->external_facing.

IIUC, this usage is asking "might we ever have an external device?"
as opposed to the "pdev->untrusted" uses, which are asking "is *this*
device an external device?"
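
A minimal userspace sketch of that reasoning, not kernel code: `struct model_dev` and both helper names are hypothetical stand-ins for the two `struct pci_dev` bits and for the patched set_pcie_untrusted() / has_untrusted_dev() logic quoted in this message. It assumes devices are enumerated parent-first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct pci_dev bits under discussion. */
struct model_dev {
    struct model_dev *parent;   /* upstream bridge, NULL at the root */
    bool external_facing;       /* set from the platform (ACPI / devicetree) */
    bool untrusted;             /* derived while enumerating */
};

/* Mirrors the patched check: inherit "untrusted" from the upstream bridge. */
static void model_set_untrusted(struct model_dev *dev)
{
    struct model_dev *parent = dev->parent;

    if (parent && (parent->untrusted || parent->external_facing))
        dev->untrusted = true;
}

/*
 * "Might we ever have an external device?" Because "untrusted" can only be
 * inherited from below an external-facing port, testing external_facing
 * alone is sufficient here.
 */
static bool model_has_external_facing(struct model_dev **devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (devs[i]->external_facing)
            return true;
    return false;
}
```

In this model the external-facing port itself stays trusted, while a switch and an endpoint below it both end up untrusted.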

> return true;
>
> return false;
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 27839cd2459f6..22727fc9558df 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -42,7 +42,7 @@ void pci_set_bus_of_node(struct pci_bus *bus)
> } else {
> node = of_node_get(bus->self->dev.of_node);
> if (node && of_property_read_bool(node, "external-facing"))
> - bus->self->untrusted = true;
> + bus->self->external_facing = true;
> }
>
> bus->dev.of_node = node;
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 7224b1e5f2a83..492c07805caf8 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -1213,22 +1213,23 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
> ACPI_FREE(obj);
> }
>
> -static void pci_acpi_set_untrusted(struct pci_dev *dev)
> +static void pci_acpi_set_external_facing(struct pci_dev *dev)
> {
> u8 val;
>
> - if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)

This looks like a change worthy of its own patch. We used to look for
"ExternalFacingPort" only on Root Ports; now we'll also do it for
Switch Downstream Ports.

Can you include DT and ACPI spec references if they exist? I found
this mention:
https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
which actually says it should only be implemented for Root Ports.

It also mentions a "DmaProperty" that looks related. Maybe Linux
should also pay attention to this?

If we do change this, should we use pcie_downstream_port(), which
includes PCI-to-PCIe bridges as well?
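
To make the difference concrete, here is a small sketch comparing the port-type test in the v2 patch with the set of types that pcie_downstream_port() accepts. The two function names are hypothetical and take a bare type value instead of a `struct pci_dev`; the `PCI_EXP_TYPE_*` values are the ones from include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>

/* PCIe device/port types (PCI Express Capabilities register). */
#define PCI_EXP_TYPE_ENDPOINT    0x0 /* Express Endpoint */
#define PCI_EXP_TYPE_ROOT_PORT   0x4 /* Root Port */
#define PCI_EXP_TYPE_UPSTREAM    0x5 /* Switch Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM  0x6 /* Switch Downstream Port */
#define PCI_EXP_TYPE_PCIE_BRIDGE 0x8 /* PCI/PCI-X to PCIe Bridge */

/* Port types the v2 patch would query for "ExternalFacingPort". */
static bool model_v2_port_check(int type)
{
    return type == PCI_EXP_TYPE_ROOT_PORT ||
           type == PCI_EXP_TYPE_DOWNSTREAM;
}

/* Port types accepted by pcie_downstream_port(): adds PCI-to-PCIe bridges. */
static bool model_downstream_port(int type)
{
    return type == PCI_EXP_TYPE_ROOT_PORT ||
           type == PCI_EXP_TYPE_DOWNSTREAM ||
           type == PCI_EXP_TYPE_PCIE_BRIDGE;
}
```

The only type on which the two checks disagree is `PCI_EXP_TYPE_PCIE_BRIDGE`.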

> return;
> if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val))
> return;
>
> /*
> - * These root ports expose PCIe (including DMA) outside of the
> - * system so make sure we treat them and everything behind as
> + * These root/down ports expose PCIe (including DMA) outside of the
> + * system so make sure we treat everything behind them as
> * untrusted.
> */
> if (val)
> - dev->untrusted = 1;
> + dev->external_facing = 1;
> }
>
> static void pci_acpi_setup(struct device *dev)
> @@ -1240,7 +1241,7 @@ static void pci_acpi_setup(struct device *dev)
> return;
>
> pci_acpi_optimize_delay(pci_dev, adev->handle);
> - pci_acpi_set_untrusted(pci_dev);
> + pci_acpi_set_external_facing(pci_dev);
> pci_acpi_add_edr_notifier(pci_dev);
>
> pci_acpi_add_pm_notifier(adev, pci_dev);
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 6d87066a5ecc5..8c40c00413e74 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1552,7 +1552,7 @@ static void set_pcie_untrusted(struct pci_dev *dev)
> * untrusted as well.
> */
> parent = pci_upstream_bridge(dev);
> - if (parent && parent->untrusted)
> + if (parent && (parent->untrusted || parent->external_facing))
> dev->untrusted = true;
> }
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index a26be5332bba6..fe1bc603fda40 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -432,6 +432,14 @@ struct pci_dev {
> * mappings to make sure they cannot access arbitrary memory.
> */
> unsigned int untrusted:1;
> + /*
> + * Devices are marked as external-facing using info from platform
> + * (ACPI / devicetree). An external-facing device is still an internal
> + * trusted device, but it faces external untrusted devices. Thus any
> + * devices enumerated downstream an external-facing device is marked
> + * as untrusted.

This comment has a subject/verb agreement problem.

> + */
> + unsigned int external_facing:1;
> unsigned int broken_intx_masking:1; /* INTx masking can't be used */
> unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */
> unsigned int irq_managed:1;
> --
> 2.27.0.212.ge8ba1cc988-goog
>

2020-07-06 16:44:07

by Bjorn Helgaas

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On Tue, Jun 30, 2020 at 09:55:54AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
> > The "ExternalFacing" devices (root ports) are still internal devices that
> > sit on the internal system fabric and thus trusted. Currently they were
> > being marked untrusted.
> >
> > This patch uses the platform flag to identify the external facing devices
> > and then use it to mark any downstream devices as "untrusted". The
> > external-facing devices themselves are left as "trusted". This was
> > discussed here: https://lkml.org/lkml/2020/6/10/1049
>
> {sigh}
>
> First off, please use lore.kernel.org links, we don't control lkml.org
> and it often times has been down.
>
> Also, you need to put all of the information in the changelog, referring
> to another place isn't always the best thing, considering you will be
> looking this up in 20+ years to try to figure out why people came up
> with such a crazy design.
>
> But, the main point is, no, we did not decide on this. "trust" is a
> policy decision to make by userspace, it is independent of "location",
> while you are tying it directly here, which is what I explicitly said
> NOT to do.
>
> So again, no, I will NAK this patch as-is, sorry, you are mixing things
> together in a way that it should not do at this point in time.

What do you see being mixed together here? I acknowledge that the
name of "pdev->untrusted" is probably a mistake. But this patch
doesn't change anything there. It only changes the treatment of the
edge case of the "ExternalFacing" ports. Previously we treated them
as being external themselves, which does seem wrong.

2020-07-06 16:47:49

by Bjorn Helgaas

Subject: Re: [PATCH v2 3/7] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices

On Mon, Jun 29, 2020 at 09:49:39PM -0700, Rajat Jain wrote:
> When enabling ACS, enable translation blocking for external facing ports
> and untrusted devices.
>
> Signed-off-by: Rajat Jain <[email protected]>
> ---
> v2: Commit log change
>
> drivers/pci/pci.c | 4 ++++
> drivers/pci/quirks.c | 11 +++++++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d2ff987585855..79853b52658a2 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> /* Upstream Forwarding */
> ctrl |= (cap & PCI_ACS_UF);
>
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
> pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> }
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b341628e47527..6294adeac4049 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> }
> }
>
> +/*
> + * Currently this quirk does the equivalent of
> + * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
> + *
> + * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
> + * if dev->external_facing || dev->untrusted

I don't understand this comment. Is this a "TODO"? Is there
something more that needs to be done here?

After a patch is applied, a comment should describe the code as it is.

> + */
> static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> {
> if (!pci_quirk_intel_pch_acs_match(dev))
> @@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> ctrl |= (cap & PCI_ACS_CR);
> ctrl |= (cap & PCI_ACS_UF);
>
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
> pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
>
> pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
> --
> 2.27.0.212.ge8ba1cc988-goog
>

2020-07-06 17:09:26

by Bjorn Helgaas

Subject: Re: [PATCH v2 3/7] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices

On Mon, Jun 29, 2020 at 09:49:39PM -0700, Rajat Jain wrote:
> When enabling ACS, enable translation blocking for external facing ports
> and untrusted devices.
>
> Signed-off-by: Rajat Jain <[email protected]>
> ---
> v2: Commit log change
>
> drivers/pci/pci.c | 4 ++++
> drivers/pci/quirks.c | 11 +++++++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d2ff987585855..79853b52658a2 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> /* Upstream Forwarding */
> ctrl |= (cap & PCI_ACS_UF);
>
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
> pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> }
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b341628e47527..6294adeac4049 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> }
> }
>
> +/*
> + * Currently this quirk does the equivalent of
> + * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV

Nit: Reorder these as in c8de8ed2dcaa ("PCI: Make ACS quirk
implementations more uniform") so they match other similar lists in
the code.

But more to the point: we have a bunch of other quirks for devices
that do not have an ACS capability but *do* provide some ACS-like
features. Most of them support

PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF

because that's what we usually want. But I bet some of them also
actually provide the equivalent of PCI_ACS_TB.

REQ_ACS_FLAGS doesn't include PCI_ACS_TB. Is there anything we need
to do on the pci_acs_enabled() side to check for PCI_ACS_TB, and
consequently, to update any of the quirks for devices that provide it?
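
A rough sketch of why the question matters, under stated assumptions: `model_acs_ctrl()` is a hypothetical simplification of the patched pci_std_enable_acs() (it starts from a zero control word instead of reading the register), and `model_acs_flags_enabled()` is a hypothetical stand-in for the kind of mask test done on the pci_acs_enabled() side. REQ_ACS_FLAGS is assumed to be the four-flag list given above; the bit values are from include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_ACS_SV 0x0001 /* Source Validation */
#define PCI_ACS_TB 0x0002 /* Translation Blocking */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010 /* Upstream Forwarding */

/* Assumed to match the list above: no PCI_ACS_TB. */
#define REQ_ACS_FLAGS (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)

/* Control bits to enable: only what the capability advertises. */
static uint16_t model_acs_ctrl(uint16_t cap, bool external_facing,
                               bool untrusted)
{
    uint16_t ctrl = 0;

    ctrl |= (cap & PCI_ACS_SV);
    ctrl |= (cap & PCI_ACS_RR);
    ctrl |= (cap & PCI_ACS_CR);
    ctrl |= (cap & PCI_ACS_UF);

    /* Translation Blocking only for external-facing/untrusted devices. */
    if (external_facing || untrusted)
        ctrl |= (cap & PCI_ACS_TB);

    return ctrl;
}

/* True when every requested flag is set in the control word. */
static bool model_acs_flags_enabled(uint16_t ctrl, uint16_t flags)
{
    return (ctrl & flags) == flags;
}
```

With these definitions, a check against REQ_ACS_FLAGS passes whether or not Translation Blocking was enabled, so a caller that cares about PCI_ACS_TB would have to ask for it explicitly.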

> + *
> + * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
> + * if dev->external_facing || dev->untrusted
> + */
> static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> {
> if (!pci_quirk_intel_pch_acs_match(dev))
> @@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> ctrl |= (cap & PCI_ACS_CR);
> ctrl |= (cap & PCI_ACS_UF);
>
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
> pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
>
> pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
> --
> 2.27.0.212.ge8ba1cc988-goog
>

2020-07-06 18:51:57

by Greg Kroah-Hartman

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On Mon, Jul 06, 2020 at 11:41:26AM -0500, Bjorn Helgaas wrote:
> On Tue, Jun 30, 2020 at 09:55:54AM +0200, Greg Kroah-Hartman wrote:
> > On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
> > > The "ExternalFacing" devices (root ports) are still internal devices that
> > > sit on the internal system fabric and thus trusted. Currently they were
> > > being marked untrusted.
> > >
> > > This patch uses the platform flag to identify the external facing devices
> > > and then use it to mark any downstream devices as "untrusted". The
> > > external-facing devices themselves are left as "trusted". This was
> > > discussed here: https://lkml.org/lkml/2020/6/10/1049
> >
> > {sigh}
> >
> > First off, please use lore.kernel.org links, we don't control lkml.org
> > and it often times has been down.
> >
> > Also, you need to put all of the information in the changelog, referring
> > to another place isn't always the best thing, considering you will be
> > looking this up in 20+ years to try to figure out why people came up
> > with such a crazy design.
> >
> > But, the main point is, no, we did not decide on this. "trust" is a
> > > policy decision to make by userspace, it is independent of "location",
> > > while you are tying it directly here, which is what I explicitly said
> > NOT to do.
> >
> > So again, no, I will NAK this patch as-is, sorry, you are mixing things
> > together in a way that it should not do at this point in time.
>
> What do you see being mixed together here? I acknowledge that the
> name of "pdev->untrusted" is probably a mistake. But this patch
> doesn't change anything there. It only changes the treatment of the
> edge case of the "ExternalFacing" ports. Previously we treated them
> as being external themselves, which does seem wrong.

I don't see the patch here, and it's been a while but I think there is a
mixture of "location" and "trust" happening here with a single value
when they should be separate.

Hopefully the next round of this patch series will be better.

thanks,

greg k-h

2020-07-06 22:20:03

by Rajat Jain

Subject: Re: [PATCH v2 0/7] Tighten PCI security, expose dev location in sysfs

On Sat, Jul 4, 2020 at 4:44 AM Pavel Machek <[email protected]> wrote:
>
> Hi!
>
> > * The first 3 patches tighten the PCI security using ACS, and take care
> > of a border case.
> > * The 4th patch takes care of PCI bug.
> > * 5th and 6th patches expose a device's location into the sysfs to allow
> > admin to make decision based on that.
>
> I see no patch for Documentation -- new sysfs interfaces should be
> documented for 5/6.

Yes, sorry. Patches 5/6 have run into discussion, and it looks like they
are not acceptable at the moment.

Thanks,

Rajat

>
> Pavel
>
> > drivers/base/core.c | 35 +++++++++++++++++++++++++++++++
> > drivers/iommu/intel/iommu.c | 31 ++++++++++++++++++---------
> > drivers/pci/ats.c | 2 +-
> > drivers/pci/bus.c | 13 ++++++------
> > drivers/pci/of.c | 2 +-
> > drivers/pci/p2pdma.c | 2 +-
> > drivers/pci/pci-acpi.c | 13 ++++++------
> > drivers/pci/pci-driver.c | 1 +
> > drivers/pci/pci.c | 34 ++++++++++++++++++++++++++----
> > drivers/pci/pci.h | 3 ++-
> > drivers/pci/probe.c | 20 +++++++++++-------
> > drivers/pci/quirks.c | 19 +++++++++++++----
> > include/linux/device.h | 42 +++++++++++++++++++++++++++++++++++++
> > include/linux/device/bus.h | 8 +++++++
> > include/linux/pci.h | 13 ++++++------
> > 15 files changed, 191 insertions(+), 47 deletions(-)
> >
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2020-07-06 22:35:17

by Rajat Jain

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

Hello,

On Mon, Jul 6, 2020 at 9:38 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
> > The "ExternalFacing" devices (root ports) are still internal devices that
> > sit on the internal system fabric and thus trusted. Currently they were
> > being marked untrusted.
> >
> > This patch uses the platform flag to identify the external facing devices
> > and then use it to mark any downstream devices as "untrusted". The
> > external-facing devices themselves are left as "trusted". This was
> > discussed here: https://lkml.org/lkml/2020/6/10/1049
>
> Use the imperative mood in the commit log, as you did for 1/7. E.g.,
> instead of "This patch uses ...", say "Use the platform flag ...".
> That helps all the commit logs read nicely together.
>
> I think this patch makes two changes that should be separated:
>
> - Treat "external-facing" devices as internal.
>
> - Look for the "external-facing" or "ExternalFacing" property on
> Switch Downstream Ports as well as Root Ports.
>
> > Signed-off-by: Rajat Jain <[email protected]>
> > ---
> > v2: cosmetic changes in commit log
> >
> > drivers/iommu/intel/iommu.c | 2 +-
> > drivers/pci/of.c | 2 +-
> > drivers/pci/pci-acpi.c | 13 +++++++------
> > drivers/pci/probe.c | 2 +-
> > include/linux/pci.h | 8 ++++++++
> > 5 files changed, 18 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d759e7234e982..1ccb224f82496 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -4743,7 +4743,7 @@ static inline bool has_untrusted_dev(void)
> > struct pci_dev *pdev = NULL;
> >
> > for_each_pci_dev(pdev)
> > - if (pdev->untrusted)
> > + if (pdev->untrusted || pdev->external_facing)
>
> I think checking pdev->external_facing is enough for this case,
> because it's impossible to have pdev->untrusted unless a parent has
> pdev->external_facing.

Agree.

>
> IIUC, this usage is asking "might we ever have an external device?"
> as opposed to the "pdev->untrusted" uses, which are asking "is *this*
> device an external device?"

Agree.

>
> > return true;
> >
> > return false;
> > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > index 27839cd2459f6..22727fc9558df 100644
> > --- a/drivers/pci/of.c
> > +++ b/drivers/pci/of.c
> > @@ -42,7 +42,7 @@ void pci_set_bus_of_node(struct pci_bus *bus)
> > } else {
> > node = of_node_get(bus->self->dev.of_node);
> > if (node && of_property_read_bool(node, "external-facing"))
> > - bus->self->untrusted = true;
> > + bus->self->external_facing = true;
> > }
> >
> > bus->dev.of_node = node;
> > diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> > index 7224b1e5f2a83..492c07805caf8 100644
> > --- a/drivers/pci/pci-acpi.c
> > +++ b/drivers/pci/pci-acpi.c
> > @@ -1213,22 +1213,23 @@ static void pci_acpi_optimize_delay(struct pci_dev *pdev,
> > ACPI_FREE(obj);
> > }
> >
> > -static void pci_acpi_set_untrusted(struct pci_dev *dev)
> > +static void pci_acpi_set_external_facing(struct pci_dev *dev)
> > {
> > u8 val;
> >
> > - if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> > + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
>
> This looks like a change worthy of its own patch. We used to look for
> "ExternalFacingPort" only on Root Ports; now we'll also do it for
> Switch Downstream Ports.

Can do. (please see below)

>
> Can you include DT and ACPI spec references if they exist? I found
> this mention:
> https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
> which actually says it should only be implemented for Root Ports.

I actually have no references. It seems to me that the Microsoft spec
assumes that all external ports must be implemented on root ports, but
I think it would be equally fair for systems with PCIe switches to
implement one on one of their switch downstream ports. I don't have an
immediate use for this anyway, so if you think this should wait until
someone really has this case, it can wait. Let me know.

>
> It also mentions a "DmaProperty" that looks related. Maybe Linux
> should also pay attention to this?

Interesting. Since this is neither used by the kernel currently nor
exposed by (our) BIOS, I don't have an immediate use case for this.
I'd like to defer this for later (as-the-need-arises).

>
> If we do change this, should we use pcie_downstream_port(), which
> includes PCI-to-PCIe bridges as well?

Sure, can do that.

>
> > return;
> > if (device_property_read_u8(&dev->dev, "ExternalFacingPort", &val))
> > return;
> >
> > /*
> > - * These root ports expose PCIe (including DMA) outside of the
> > - * system so make sure we treat them and everything behind as
> > + * These root/down ports expose PCIe (including DMA) outside of the
> > + * system so make sure we treat everything behind them as
> > * untrusted.
> > */
> > if (val)
> > - dev->untrusted = 1;
> > + dev->external_facing = 1;
> > }
> >
> > static void pci_acpi_setup(struct device *dev)
> > @@ -1240,7 +1241,7 @@ static void pci_acpi_setup(struct device *dev)
> > return;
> >
> > pci_acpi_optimize_delay(pci_dev, adev->handle);
> > - pci_acpi_set_untrusted(pci_dev);
> > + pci_acpi_set_external_facing(pci_dev);
> > pci_acpi_add_edr_notifier(pci_dev);
> >
> > pci_acpi_add_pm_notifier(adev, pci_dev);
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index 6d87066a5ecc5..8c40c00413e74 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1552,7 +1552,7 @@ static void set_pcie_untrusted(struct pci_dev *dev)
> > * untrusted as well.
> > */
> > parent = pci_upstream_bridge(dev);
> > - if (parent && parent->untrusted)
> > + if (parent && (parent->untrusted || parent->external_facing))
> > dev->untrusted = true;
> > }
> >
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index a26be5332bba6..fe1bc603fda40 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -432,6 +432,14 @@ struct pci_dev {
> > * mappings to make sure they cannot access arbitrary memory.
> > */
> > unsigned int untrusted:1;
> > + /*
> > + * Devices are marked as external-facing using info from platform
> > + * (ACPI / devicetree). An external-facing device is still an internal
> > + * trusted device, but it faces external untrusted devices. Thus any
> > + * devices enumerated downstream an external-facing device is marked
> > + * as untrusted.
>
> This comment has a subject/verb agreement problem.

I assume you meant s/is/are/ in last sentence. Will do.

Thanks,

Rajat


>
> > + */
> > + unsigned int external_facing:1;
> > unsigned int broken_intx_masking:1; /* INTx masking can't be used */
> > unsigned int io_window_1k:1; /* Intel bridge 1K I/O windows */
> > unsigned int irq_managed:1;
> > --
> > 2.27.0.212.ge8ba1cc988-goog
> >

2020-07-06 23:16:56

by Rajat Jain

Subject: Re: [PATCH v2 3/7] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices

On Mon, Jul 6, 2020 at 9:45 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Jun 29, 2020 at 09:49:39PM -0700, Rajat Jain wrote:
> > When enabling ACS, enable translation blocking for external facing ports
> > and untrusted devices.
> >
> > Signed-off-by: Rajat Jain <[email protected]>
> > ---
> > v2: Commit log change
> >
> > drivers/pci/pci.c | 4 ++++
> > drivers/pci/quirks.c | 11 +++++++++++
> > 2 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index d2ff987585855..79853b52658a2 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> > /* Upstream Forwarding */
> > ctrl |= (cap & PCI_ACS_UF);
> >
> > + if (dev->external_facing || dev->untrusted)
> > + /* Translation Blocking */
> > + ctrl |= (cap & PCI_ACS_TB);
> > +
> > pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> > }
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index b341628e47527..6294adeac4049 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> > }
> > }
> >
> > +/*
> > + * Currently this quirk does the equivalent of
> > + * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
> > + *
> > + * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
> > + * if dev->external_facing || dev->untrusted
>
> I don't understand this comment. Is this a "TODO"? Is there
> something more that needs to be done here?

Yes. I'll mark it as a TODO to make it more clear.

>
> After a patch is applied, a comment should describe the code as it is.
>
> > + */
> > static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> > {
> > if (!pci_quirk_intel_pch_acs_match(dev))
> > @@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> > ctrl |= (cap & PCI_ACS_CR);
> > ctrl |= (cap & PCI_ACS_UF);
> >
> > + if (dev->external_facing || dev->untrusted)
> > + /* Translation Blocking */
> > + ctrl |= (cap & PCI_ACS_TB);
> > +
> > pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
> >
> > pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
> > --
> > 2.27.0.212.ge8ba1cc988-goog
> >

2020-07-06 23:23:14

by Rajat Jain

Subject: Re: [PATCH v2 3/7] PCI/ACS: Enable PCI_ACS_TB for untrusted/external-facing devices

On Mon, Jul 6, 2020 at 10:07 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Jun 29, 2020 at 09:49:39PM -0700, Rajat Jain wrote:
> > When enabling ACS, enable translation blocking for external facing ports
> > and untrusted devices.
> >
> > Signed-off-by: Rajat Jain <[email protected]>
> > ---
> > v2: Commit log change
> >
> > drivers/pci/pci.c | 4 ++++
> > drivers/pci/quirks.c | 11 +++++++++++
> > 2 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index d2ff987585855..79853b52658a2 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
> > /* Upstream Forwarding */
> > ctrl |= (cap & PCI_ACS_UF);
> >
> > + if (dev->external_facing || dev->untrusted)
> > + /* Translation Blocking */
> > + ctrl |= (cap & PCI_ACS_TB);
> > +
> > pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
> > }
> >
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index b341628e47527..6294adeac4049 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct pci_dev *dev)
> > }
> > }
> >
> > +/*
> > + * Currently this quirk does the equivalent of
> > + * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
>
> Nit: Reorder these as in c8de8ed2dcaa ("PCI: Make ACS quirk
> implementations more uniform") so they match other similar lists in
> the code.

Will do.

>
> But more to the point: we have a bunch of other quirks for devices
> that do not have an ACS capability but *do* provide some ACS-like
> features. Most of them support
>
> PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF
>
> because that's what we usually want. But I bet some of them also
> actually provide the equivalent of PCI_ACS_TB.
>
> REQ_ACS_FLAGS doesn't include PCI_ACS_TB. Is there anything we need
> to do on the pci_acs_enabled() side to check for PCI_ACS_TB, and
> consequently, to update any of the quirks for devices that provide it?

I'm actually not sure.
+Alex Williamson, do you have any comments here?

Thanks,

Rajat

>
> > + *
> > + * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
> > + * if dev->external_facing || dev->untrusted
> > + */
> > static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
> > {
> > if (!pci_quirk_intel_pch_acs_match(dev))
> > @@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> > ctrl |= (cap & PCI_ACS_CR);
> > ctrl |= (cap & PCI_ACS_UF);
> >
> > + if (dev->external_facing || dev->untrusted)
> > + /* Translation Blocking */
> > + ctrl |= (cap & PCI_ACS_TB);
> > +
> > pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
> >
> > pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
> > --
> > 2.27.0.212.ge8ba1cc988-goog
> >
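
For readers following the patch, the control-bit selection it describes can be sketched in plain userspace C. This is only an illustration, not the kernel code: `struct toy_port` and `toy_acs_ctrl()` are hypothetical names, and the bit values are assumed to match the ACS definitions in include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS capability/control bits (assumed to match linux/pci_regs.h) */
#define PCI_ACS_SV 0x0001 /* Source Validation */
#define PCI_ACS_TB 0x0002 /* Translation Blocking */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010 /* Upstream Forwarding */

/* Hypothetical stand-in for the two struct pci_dev flags the patch tests. */
struct toy_port {
	bool external_facing;
	bool untrusted;
};

/*
 * Compute the ACS control value: enable each feature only if the
 * capability register advertises it, and add Translation Blocking
 * only for external-facing ports or untrusted devices.
 */
static uint16_t toy_acs_ctrl(const struct toy_port *port, uint16_t cap,
			     uint16_t ctrl)
{
	ctrl |= (cap & PCI_ACS_SV);
	ctrl |= (cap & PCI_ACS_RR);
	ctrl |= (cap & PCI_ACS_CR);
	ctrl |= (cap & PCI_ACS_UF);

	if (port->external_facing || port->untrusted)
		ctrl |= (cap & PCI_ACS_TB);

	return ctrl;
}
```

Masking each bit with `cap` means a port that does not advertise Translation Blocking is simply left without it, which is why the quirk paths discussed above need their own equivalent.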

2020-07-06 23:31:22

by Bjorn Helgaas

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

On Mon, Jul 06, 2020 at 03:31:47PM -0700, Rajat Jain wrote:
> On Mon, Jul 6, 2020 at 9:38 AM Bjorn Helgaas <[email protected]> wrote:
> > On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:

> > > -static void pci_acpi_set_untrusted(struct pci_dev *dev)
> > > +static void pci_acpi_set_external_facing(struct pci_dev *dev)
> > > {
> > > u8 val;
> > >
> > > - if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> > > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> > > + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
> >
> > This looks like a change worthy of its own patch. We used to look for
> > "ExternalFacingPort" only on Root Ports; now we'll also do it for
> > Switch Downstream Ports.
>
> Can do. (please see below)
>
> > Can you include DT and ACPI spec references if they exist? I found
> > this mention:
> > https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
> > which actually says it should only be implemented for Root Ports.
>
> I actually have no references. It seems to me that the microsoft spec
> assumes that all external ports must be implemented on root ports, but
> I think it would be equally fair for systems with PCIe switches to
> implement one on one of their switch downstream ports. I don't have an
> immediate use of this anyway, so if you think this should rather wait
> unless someone really has this case, this can wait. Let me know.

I agree that it "makes sense" to pay attention to this property no
matter where it appears, but since that Microsoft doc went to the
trouble to restrict it to Root Ports, I think we should leave this
as-is and only look for it in the Root Port. Otherwise Linux will
accept something Windows will reject, and that seems like a needless
difference.

We can at least include the above link to the Microsoft doc in the
commit log.

> > It also mentions a "DmaProperty" that looks related. Maybe Linux
> > should also pay attention to this?
>
> Interesting. Since this is not in use currently by the kernel as well
> as not exposed by (our) BIOS, I don't have an immediate use case for
> this. I'd like to defer this for later (as-the-need-arises).

I agree, you can defer this until you see a need for it. I just
pointed it out in case it would be useful to you.

> > > + /*
> > > + * Devices are marked as external-facing using info from platform
> > > + * (ACPI / devicetree). An external-facing device is still an internal
> > > + * trusted device, but it faces external untrusted devices. Thus any
> > > + * devices enumerated downstream an external-facing device is marked
> > > + * as untrusted.
> >
> > This comment has a subject/verb agreement problem.
>
> I assume you meant s/is/are/ in last sentence. Will do.

Right. There's also something wrong with "enumerated downstream an".
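
The rule the quoted comment is trying to state (an external-facing port is itself trusted, while everything enumerated below it is untrusted) can be sketched as follows. This is a toy model under stated assumptions: `struct toy_dev` and `toy_set_untrusted()` are hypothetical, and the sketch assumes devices are visited parent-first, as happens during enumeration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device node; not the kernel's struct pci_dev. */
struct toy_dev {
	struct toy_dev *parent;   /* upstream bridge, NULL for a root port */
	bool external_facing;     /* set from platform data (ACPI / DT) */
	bool untrusted;           /* derived, never set on the port itself */
};

/*
 * Mark a device untrusted if its parent is external-facing or is
 * itself untrusted. Must be called in enumeration (parent-first) order
 * so that the parent's flag is already final.
 */
static void toy_set_untrusted(struct toy_dev *dev)
{
	struct toy_dev *parent = dev->parent;

	if (parent && (parent->external_facing || parent->untrusted))
		dev->untrusted = true;
}
```

Because the flag is inherited, a switch plugged into an external-facing port makes every device behind that switch untrusted as well.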

2020-07-06 23:36:32

by Rajat Jain

Subject: Re: [PATCH v2 4/7] PCI: Add device even if driver attach failed

On Tue, Jun 30, 2020 at 1:02 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Mon, Jun 29, 2020 at 09:49:40PM -0700, Rajat Jain wrote:
> > device_attach() returning failure indicates a driver error while trying to
> > probe the device. In such a scenario, the PCI device should still be added
> > in the system and be visible to the user.
> >
> > This patch partially reverts:
> > commit ab1a187bba5c ("PCI: Check device_attach() return value always")
> >
> > Signed-off-by: Rajat Jain <[email protected]>
> > Reviewed-by: Greg Kroah-Hartman <[email protected]>
> > ---
> > v2: Cosmetic change in commit log.
> > Add Greg's "reviewed-by"
> >
> > drivers/pci/bus.c | 6 +-----
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> > index 8e40b3e6da77d..3cef835b375fd 100644
> > --- a/drivers/pci/bus.c
> > +++ b/drivers/pci/bus.c
> > @@ -322,12 +322,8 @@ void pci_bus_add_device(struct pci_dev *dev)
> >
> > dev->match_driver = true;
> > retval = device_attach(&dev->dev);
> > - if (retval < 0 && retval != -EPROBE_DEFER) {
> > + if (retval < 0 && retval != -EPROBE_DEFER)
> > pci_warn(dev, "device attach failed (%d)\n", retval);
> > - pci_proc_detach_device(dev);
> > - pci_remove_sysfs_dev_files(dev);
> > - return;
> > - }
> >
> > pci_dev_assign_added(dev, true);
> > }
>
> This should go first in the series, and cc: stable and get merged now.
> No need to tie it to this series at all.
>
> Or just an independent patch, it doesn't have much to do with this
> series, it's a bugfix.

Resent this patch as an independent patch with cc:stable here:
https://lore.kernel.org/patchwork/patch/1268456/

Thanks,

Rajat

>
> thanks,
>
> greg k-h
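
The behaviour after this patch can be modelled in a few lines of userspace C. This is a sketch only: `struct toy_pci_dev` and `toy_bus_add_device()` are hypothetical, the attach result is passed in rather than obtained from `device_attach()`, and `EPROBE_DEFER` is defined locally on the assumption that it matches the kernel-internal value.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno, not exported to userspace headers (assumed 517). */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical stand-in for the bits of struct pci_dev that matter here. */
struct toy_pci_dev {
	bool added;     /* mirrors pci_dev_assign_added(dev, true) */
	int warnings;   /* counts what pci_warn() would have logged */
};

/*
 * Post-patch flow: a failed attach is reported but no longer aborts
 * the add, so the device always ends up marked as added.
 */
static void toy_bus_add_device(struct toy_pci_dev *dev, int attach_retval)
{
	if (attach_retval < 0 && attach_retval != -EPROBE_DEFER)
		dev->warnings++;

	dev->added = true;
}
```

The point of the change is visible in the last statement: it runs unconditionally, so a driver probe error no longer hides the device from the user.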

2020-07-06 23:42:06

by Rajat Jain

Subject: Re: [PATCH v2 2/7] PCI: Set "untrusted" flag for truly external devices only

Hello Bjorn,

On Mon, Jul 6, 2020 at 4:30 PM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Jul 06, 2020 at 03:31:47PM -0700, Rajat Jain wrote:
> > On Mon, Jul 6, 2020 at 9:38 AM Bjorn Helgaas <[email protected]> wrote:
> > > On Mon, Jun 29, 2020 at 09:49:38PM -0700, Rajat Jain wrote:
>
> > > > -static void pci_acpi_set_untrusted(struct pci_dev *dev)
> > > > +static void pci_acpi_set_external_facing(struct pci_dev *dev)
> > > > {
> > > > u8 val;
> > > >
> > > > - if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT)
> > > > + if (pci_pcie_type(dev) != PCI_EXP_TYPE_ROOT_PORT &&
> > > > + pci_pcie_type(dev) != PCI_EXP_TYPE_DOWNSTREAM)
> > >
> > > This looks like a change worthy of its own patch. We used to look for
> > > "ExternalFacingPort" only on Root Ports; now we'll also do it for
> > > Switch Downstream Ports.
> >
> > Can do. (please see below)
> >
> > > Can you include DT and ACPI spec references if they exist? I found
> > > this mention:
> > > https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports
> > > which actually says it should only be implemented for Root Ports.
> >
> > I actually have no references. It seems to me that the microsoft spec
> > assumes that all external ports must be implemented on root ports, but
> > I think it would be equally fair for systems with PCIe switches to
> > implement one on one of their switch downstream ports. I don't have an
> > immediate use of this anyway, so if you think this should rather wait
> > unless someone really has this case, this can wait. Let me know.
>
> I agree that it "makes sense" to pay attention to this property no
> matter where it appears, but since that Microsoft doc went to the
> trouble to restrict it to Root Ports, I think we should leave this
> as-is and only look for it in the Root Port. Otherwise Linux will
> accept something Windows will reject, and that seems like a needless
> difference.
>
> We can at least include the above link to the Microsoft doc in the
> commit log.

Will do.

>
> > > It also mentions a "DmaProperty" that looks related. Maybe Linux
> > > should also pay attention to this?
> >
> > Interesting. Since this is not in use currently by the kernel as well
> > as not exposed by (our) BIOS, I don't have an immediate use case for
> > this. I'd like to defer this for later (as-the-need-arises).
>
> I agree, you can defer this until you see a need for it. I just
> pointed it out in case it would be useful to you.
>
> > > > + /*
> > > > + * Devices are marked as external-facing using info from platform
> > > > + * (ACPI / devicetree). An external-facing device is still an internal
> > > > + * trusted device, but it faces external untrusted devices. Thus any
> > > > + * devices enumerated downstream an external-facing device is marked
> > > > + * as untrusted.
> > >
> > > This comment has a subject/verb agreement problem.
> >
> > I assume you meant s/is/are/ in last sentence. Will do.
>
> Right. There's also something wrong with "enumerated downstream an".

I'm apparently really bad at English :-). This is what I have in my
latest patch I am about to send out:

"Thus any device enumerated downstream an external-facing device, is
marked as untrusted."

Are you suggesting s/an/a/ ? Please let me know what you would like to
see and I'd copy it as-is :-)

Thanks!

Rajat

2020-07-07 06:04:38

by Rajat Jain

Subject: Re: [PATCH v2 5/7] driver core: Add device location to "struct device" and expose it in sysfs

On Wed, Jul 1, 2020 at 10:23 PM Oliver O'Halloran <[email protected]> wrote:
>
> On Thu, Jul 2, 2020 at 4:07 AM Rajat Jain <[email protected]> wrote:
> >
> > *snip*
> >
> > > > I guess it would make sense to have an attribute for user space to
> > > > write to in order to make the kernel reject device plug-in events
> > > > coming from a given port or connector, but the kernel has no reliable
> > > > means to determine *which* ports or connectors are "safe", and even if
> > > > there was a way for it to do that, it still may not agree with user
> > > > space on which ports or connectors should be regarded as "safe".
> > >
> > > Again, we have been doing this for USB devices for a very long time, PCI
> > > shouldn't be any different. Why people keep ignoring working solutions
> > > is beyond me, there's nothing "special" about PCI devices here for this
> > > type of "worry" or reasoning to try to create new solutions.
> > >
> > > So, again, I ask, go do what USB does, and to do that, take the logic
> > > out of the USB core, make it bus-agnostic, and _THEN_ add it to the PCI
> > > code. Why the original submitter keeps ignoring my request to do this
> > > is beyond me, I guess they like making patches that will get rejected :(
> >
> > IMHO I'm actually trying to precisely do what I think was the
> > conclusion of our discussion, and then some changes because of the
> > further feedback I received on those patches. Let's take a step back
> > and please allow me to explain how I got here (my apologies but this
> > spans a couple of threads, and I'm trying to tie them all together
> > here):
>
> The previous thread had some suggestions, but no real conclusions.
> That's probably why we're still arguing about it...
>
> > GOAL: To allow user space to control what (PCI) drivers he wants to
> > allow on external (thunderbolt) ports. There was a lot of debate about
> > the need for such a policy at
> > https://lore.kernel.org/linux-pci/CACK8Z6GR7-wseug=TtVyRarVZX_ao2geoLDNBwjtB+5Y7VWNEQ@mail.gmail.com/
> > with the final conclusion that it should be OK to implement such a
> > policy in userspace, as long as the policy is not implemented in the
> > kernel. The kernel only needs to expose bits & info that is needed by
> > the userspace to implement such a policy, and it can be used in
> > conjunction with "drivers_autoprobe" to implement this policy:
> > --------------------------------------------------------------------
> > ....
> > That's an odd thing, but sure, if you want to write up such a policy for
> > your systems, great. But that policy does not belong in the kernel, it
> > belongs in userspace.
> > ....
> > --------------------------------------------------------------------
> > 1) The post https://lore.kernel.org/linux-pci/20200609210400.GA1461839@bjorn-Precision-5520/
> > lists out the approach that was agreed on. Replicating it here:
> > -----------------------------------------------------------------------
> > - Expose the PCI pdev->untrusted bit in sysfs. We don't expose this
> > today, but doing so would be trivial. I think I would prefer a
> > sysfs name like "external" so it's more descriptive and less of a
> > judgment.
> >
> > This comes from either the DT "external-facing" property or the
> > ACPI "ExternalFacingPort" property.
> >
> > - All devices present at boot are enumerated. Any statically built
> > drivers will bind to them before any userspace code runs.
> >
> > If you want to keep statically built drivers from binding, you'd
> > need to invent some mechanism so pci_driver_init() could clear
> > drivers_autoprobe after registering pci_bus_type.
> >
> > - Early userspace code prevents modular drivers from automatically
> > binding to PCI devices:
> >
> > echo 0 > /sys/bus/pci/drivers_autoprobe
> >
> > This prevents modular drivers from binding to all devices, whether
> > present at boot or hot-added.
> >
> > - Userspace code uses the sysfs "bind" file to control which drivers
> > are loaded and can bind to each device, e.g.,
> >
> > echo 0000:02:00.0 > /sys/bus/pci/drivers/nvme/bind
>
> I think this is a reasonable suggestion. However, as Greg pointed out
> it's gratuitously different to what USB does for no real reason.
>
> > -----------------------------------------------------------------------
> > 2) As part of implementing the above agreed approach, when I exposed
> > PCI "untrusted" attribute to userspace, it ran into discussion that
> > concluded that instead of this, the device core should be enhanced
> > with a location attribute.
> > https://lore.kernel.org/linux-pci/[email protected]/
> > -----------------------------------------------------------------------
> > ...
> > The attribute should be called something like "location" or something
> > like that (naming is hard), as you don't always know if something is
> > external or not (it could be internal, it could be unknown, it could be
> > internal to an external device that you trust (think PCI drawers for
> > "super" computers that are hot pluggable but yet really part of the
> > internal bus).
> > ....
> > "trust" has no direct relation to the location, except in a policy of
> > what you wish to do with that device, so as long as you keep them
> > separate that way, I am fine with it.
> > ...
> > -----------------------------------------------------------------------
> >
> > And hence this patch. I don't see an attribute in USB comparable to
> > this new attribute, except maybe for the boolean "removable". Are you
> > suggesting to pull that into the device core instead of adding this
> > "physical_location" attribute?
>
> He's suggesting you pull the "authorized" attribute into the driver
> core. That's the mechanism USB uses to block drivers binding unless
> userspace authorizes them. I don't see any reason why we can't re-use
> that sysfs interface for PCI devices since the problem being solved is
> fundamentally the same. The main question is what we should do as a
> default policy in the kernel. For USB the default comes from the
> "authorized_default" module param of usbcore:
>
> > /* authorized_default behaviour:
> > * -1 is authorized for all devices except wireless (old behaviour)
> > * 0 is unauthorized for all devices
> > * 1 is authorized for all devices
> > * 2 is authorized for internal devices
> > */
> > #define USB_AUTHORIZE_WIRED -1
> > #define USB_AUTHORIZE_NONE 0
> > #define USB_AUTHORIZE_ALL 1
> > #define USB_AUTHORIZE_INTERNAL 2
> >
> > static int authorized_default = USB_AUTHORIZE_WIRED;
> > module_param(authorized_default, int, S_IRUGO|S_IWUSR);
>
> So the default policy for USB is to authorize any wired USB device and
> we can optionally restrict that to just integrated devices. Sounding
> familiar?

Thank you for explaining! It is a lot more clear now :-)

I have separated out the PCI portions of this patchset (patches 1-4,
i.e. the ones not related to this controversial change) into their own
patchset. W.r.t. patches 5-7, I think I'd like to collect my thoughts
and send out a fresh RFC once I am ready (I'm running out of time on
my deliverables so may have to carry some patches internally for the
time being). But 2 quick points:

1) Currently there are already at least 2 existing buses with their
own versions of "authorized": usb and thunderbolt, and the UAPI /
semantics of "authorized" differ between them.

Documentation/ABI/testing/sysfs-bus-thunderbolt - "authorized" is boolean
Documentation/usb/authorization.rst - "authorized" is 0/1/2

(Side note: usb also has further "authorized"-related attributes,
e.g. interface_authorized_default, which might not have an easy
corresponding sensible meaning in other buses, so we may still have
to leave those in USB.)

So my question is, assuming we do not want to change or break existing
UAPI, if I move the "authorized" attribute to the device core, who
defines the semantics of the values it can take? It seems to me like
individual buses should define that. And if so, then device core
cannot use "authorized" value to decide to prevent drivers from
binding to it?

2) It seemed to me
(https://lore.kernel.org/linux-acpi/[email protected]/)
that we had at least some agreement that the location of a device
is a useful piece of info for userspace to have. The point I'm
trying to make is that "exporting the location of device in sysfs"
seems independent of "move untrusted attribute to the device core".
Like you said below, the location of a device is still useful for
userspace to have (though it may not be sufficient, as in the BMC case
you mention) in order to decide whether to allow a device. So why
object to this patch?

Thanks,

Rajat



>
> The internal / external status is still useful to know so we might
> want to make a sysfs attribute for that too. However, I'd like to
> point out that internal / external isn't the whole story. As I
> mentioned in the last thread if I have a BMC device I *really* don't
> want it to be authorized by default even though it's an internal
> device. Similarly, if I know all my internal cards support PCIe
> Component Authentication then I might choose not to trust any PCI
> devices unless they authenticate successfully.
>
> > 3) The one deviation from the agreed approach in (1) is
> > https://patchwork.kernel.org/patch/11633095/ . The reason is I
> > realized that contrary to what I earlier believed, we might not be
> > able to disable the PCI link to all external PCI devices at boot. So
> > external PCI devices may actually bind to drivers before userspace
> > comes up and does "echo 0 > /sys/bus/pci/drivers_autoprobe").
>
> Yep, that's a problem. If we want to provide a useful mechanism to
> userspace then the default behaviour of the kernel can't undermine
> that mechanism. If that means we need another kernel command line
> parameter then I guess we just have to live with it.
>
> Oliver
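
To make the USB comparison above concrete, the four `authorized_default` values quoted from usbcore can be read as a small decision function. This is a sketch of the semantics described in the quoted comment, not the usbcore implementation: `struct toy_device` and `toy_default_authorized()` are hypothetical, and treating unknown values like the wired default is an assumption made for the example.

```c
#include <stdbool.h>

/* Values as quoted from usbcore in the message above. */
#define USB_AUTHORIZE_WIRED	-1
#define USB_AUTHORIZE_NONE	0
#define USB_AUTHORIZE_ALL	1
#define USB_AUTHORIZE_INTERNAL	2

/* Hypothetical device description used only for this illustration. */
struct toy_device {
	bool wireless;
	bool internal;
};

/* Decide whether a newly seen device starts out authorized. */
static bool toy_default_authorized(int authorized_default,
				   const struct toy_device *dev)
{
	switch (authorized_default) {
	case USB_AUTHORIZE_NONE:
		return false;
	case USB_AUTHORIZE_ALL:
		return true;
	case USB_AUTHORIZE_INTERNAL:
		return dev->internal;
	case USB_AUTHORIZE_WIRED:
	default:
		/* Assumption: unknown values fall back to the old behaviour. */
		return !dev->wireless;
	}
}
```

Seen this way, the `USB_AUTHORIZE_INTERNAL` case is the same internal-versus-external distinction the PCI series is trying to expose, which is the parallel Oliver draws.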