¡Hola!
PCI's devres API suffers several weaknesses:
1. There are functions prefixed with pcim_. Those are always managed
counterparts to never-managed functions prefixed with pci_ – or so one
would like to think. There are some apparently unmanaged functions
(all region-request / release functions, and pci_intx()) which
suddenly become managed once the user has initialized the device with
pcim_enable_device() instead of pci_enable_device(). This "sometimes
yes, sometimes no" nature of those functions is confusing and
therefore bug-provoking. In fact, it has already caused a bug in DRM.
The last patch in this series fixes that bug.
2. iomappings: Instead of giving each mapping its own callback, the
existing API uses a statically allocated struct tracking one mapping
per bar. This is not extensible. Especially, you can't create
_ranged_ managed mappings that way, which many drivers want.
3. Managed request functions only exist as "plural versions" with a
bit-mask as a parameter. That's quite over-engineered considering
that each user only ever mapps one, maybe two bars.
This series:
- add a set of new "singular" devres functions that use devres the way
its intended, with one callback per resource.
- deprecates the existing iomap-table mechanism.
- deprecates the hybrid nature of pci_ functions.
- preserves backwards compatibility so that drivers using the existing
API won't notice any changes.
- adds documentation, especially some warning users about the
complicated nature of PCI's devres.
Note that this series is based on my "unify pci_iounmap"-series from a
few weeks ago. [1]
I tested this on a x86 VM with a simple pci test-device with two
regions. Operates and reserves resources as intended on my system.
Kasan and kmemleak didn't find any problems.
I believe this series cleans the API up as much as possible without
having to port all existing drivers to the new API. Especially, I think
that this implementation is easy to extend if the need for new managed
functions arises :)
Greetings,
P.
[1] https://lore.kernel.org/lkml/[email protected]/
Philipp Stanner (10):
pci: add new set of devres functions
pci: deprecate iomap-table functions
pci: warn users about complicated devres nature
pci: devres: make devres region requests consistent
pci: move enabled status bit to pci_dev struct
pci: move pinned status bit to pci_dev struct
pci: devres: give mwi its own callback
pci: devres: give pci(m)_intx its own callback
pci: devres: remove legacy pcim_release()
drm/vboxvideo: fix mapping leaks
Documentation/driver-api/pci/pci.rst | 3 +
drivers/gpu/drm/vboxvideo/vbox_main.c | 24 +-
drivers/pci/devres.c | 996 ++++++++++++++++++++++----
drivers/pci/iomap.c | 18 +
drivers/pci/pci.c | 124 +++-
drivers/pci/pci.h | 24 -
include/linux/pci.h | 17 +
7 files changed, 989 insertions(+), 217 deletions(-)
--
2.43.0
PCI's devres API is not extensible to ranged mappings and has
bug-provoking features. Improve that by providing better alternatives.
When the original devres API for PCI was implemented, priority was given
to the creation of a set of "pural functions" such as
pcim_request_regions(). These functions have bit masks as parameters to
specify which BARs shall get mapped. Most users, however, only use those
to mapp 1-3 BARs.
A complete set of "singular functions" does not exist.
As functions mapping / requesting multiple BARs at once have (almost) no
mechanism in C to return the resources to the caller of the plural
function, the devres API utilizes the iomap-table administrated by the
function pcim_iomap_table().
The entire PCI devres implementation was strongly tied to that table
which only allows for mapping whole, complete BARs, as the BAR's index
is used as table index. Consequently, it's not possible to, e.g., have a
pcim_iomap_range() function with that mechanism.
An additional problem is that pci-devres has been ipmlemented in a sort
of "hybrid-mode": Some unmanaged functions have managed counterparts
(e.g.: pci_iomap() <-> pcim_iomap()), making their managed nature
obvious to the programmer. However, the region-request functions in
pci.c, prefixed with pci_, behave either managed or unmanaged, depending
on whether pci_enable_device() or pcim_enable_device() has been called
in advance.
This hybrid API is confusing and should be more cleanly separated by
providing always-managed functions prefixed with pcim_.
Thus, the existing devres API is not desirable because:
a) The vast majority of the users of the plural functions only
ever sets a single bit in the bit mask, consequently making
them singular functions anyways.
b) There is no mechanism to request / iomap only part of a BAR.
c) The iomap-table mechanism is over-engineered, complicated and
can by definition not perform bounds checks, thus, provoking
memory faults: pcim_iomap_table(pdev)[42]
d) region-request functions being sometimes managed and
sometimes not is bug-provoking.
Implement a set of singular pcim_ functions that use devres directly
and bypass the legacy iomap table mechanism.
Add devres.c to driver-api documentation.
Signed-off-by: Philipp Stanner <[email protected]>
---
Documentation/driver-api/pci/pci.rst | 3 +
drivers/pci/devres.c | 481 ++++++++++++++++++++++++++-
include/linux/pci.h | 11 +
3 files changed, 490 insertions(+), 5 deletions(-)
diff --git a/Documentation/driver-api/pci/pci.rst b/Documentation/driver-api/pci/pci.rst
index 4843cfad4f60..92b11775344e 100644
--- a/Documentation/driver-api/pci/pci.rst
+++ b/Documentation/driver-api/pci/pci.rst
@@ -4,6 +4,9 @@ PCI Support Library
.. kernel-doc:: drivers/pci/pci.c
:export:
+.. kernel-doc:: drivers/pci/devres.c
+ :export:
+
.. kernel-doc:: drivers/pci/pci-driver.c
:export:
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index 4bd1e125bca1..cc8c1501eb13 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -8,10 +8,223 @@
*/
#define PCIM_IOMAP_MAX PCI_STD_NUM_BARS
+/*
+ * Legacy struct storing addresses to whole mapped bars.
+ */
struct pcim_iomap_devres {
void __iomem *table[PCIM_IOMAP_MAX];
};
+enum pcim_addr_devres_type {
+ /* Default initializer. */
+ PCIM_ADDR_DEVRES_TYPE_INVALID,
+
+ /* A region spaning an entire bar. */
+ PCIM_ADDR_DEVRES_TYPE_REGION,
+
+ /* A region spaning an entire bar, and a mapping for that whole bar. */
+ PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
+
+ /*
+ * A mapping within a bar, either spaning the whole bar or just a range.
+ * Without a requested region.
+ */
+ PCIM_ADDR_DEVRES_TYPE_MAPPING,
+
+ /* A ranged region within a bar, with a mapping spaning that range. */
+ PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING
+};
+
+/*
+ * This struct envelopes IO or MEM addresses, that means mappings and region
+ * requests, because those are very frequently requested and released together.
+ */
+struct pcim_addr_devres {
+ enum pcim_addr_devres_type type;
+ void __iomem *baseaddr;
+ unsigned long offset;
+ unsigned long len;
+ short bar;
+};
+
+static inline void pcim_addr_devres_clear(struct pcim_addr_devres *res)
+{
+ res->type = PCIM_ADDR_DEVRES_TYPE_INVALID;
+ res->bar = -1;
+ res->baseaddr = NULL;
+ res->offset = 0;
+ res->len = 0;
+}
+
+/*
+ * The following functions, __pcim_*_region*, exist as counterparts to the
+ * versions from pci.c - which, unfortunately, can be in "hybrid mode", i.e.,
+ * sometimes managed, sometimes not.
+ *
+ * To separate the APIs cleanly, we define our own, simplified versions here.
+ */
+
+/**
+ * __pcim_request_region_range - Request a ranged region
+ * @pdev: PCI device the region belongs to
+ * @bar: The BAR the region is within
+ * @offset: offset from the BAR's start address
+ * @maxlen: length in bytes, beginning at @offset
+ * @name: name associated with the request
+ * @exclusive: whether the mapping shall be exclusively for kernelspace
+ *
+ * Returns: 0 on success, a negative error code on failure.
+ *
+ * Request a ranged region within a device's PCI BAR. This function performs
+ * sanity checks on the input.
+ */
+static int __pcim_request_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long maxlen,
+ const char *name, int exclusive)
+{
+ resource_size_t start = pci_resource_start(pdev, bar);
+ resource_size_t len = pci_resource_len(pdev, bar);
+ unsigned long flags = pci_resource_flags(pdev, bar);
+
+ if (start == 0 || len == 0) /* that's an unused BAR. */
+ return 0;
+ if (len <= offset)
+ return -EINVAL;
+
+ start += offset;
+ len -= offset;
+
+ if (len > maxlen && maxlen != 0)
+ len = maxlen;
+
+ if (flags & IORESOURCE_IO) {
+ if (!request_region(start, len, name))
+ return -EBUSY;
+ } else if (flags & IORESOURCE_MEM) {
+ if (!__request_mem_region(start, len, name, exclusive))
+ return -EBUSY;
+ } else {
+ /* That's not a device we can request anything on. */
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static void __pcim_release_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long maxlen)
+{
+ resource_size_t start = pci_resource_start(pdev, bar);
+ resource_size_t len = pci_resource_len(pdev, bar);
+ unsigned long flags = pci_resource_flags(pdev, bar);
+
+ if (len <= offset || start == 0)
+ return;
+
+ if (len == 0 || maxlen == 0) /* This an unused BAR. Do nothing. */
+ return;
+
+ start += offset;
+ len -= offset;
+
+ if (len > maxlen)
+ len = maxlen;
+
+ if (flags & IORESOURCE_IO)
+ release_region(start, len);
+ else if (flags & IORESOURCE_MEM)
+ release_mem_region(start, len);
+}
+
+static int __pcim_request_region(struct pci_dev *pdev, int bar,
+ const char *name, int exclusive)
+{
+ const unsigned long offset = 0;
+ const unsigned long len = pci_resource_len(pdev, bar);
+
+ return __pcim_request_region_range(pdev, bar, offset, len, name, exclusive);
+}
+
+static void __pcim_release_region(struct pci_dev *pdev, int bar)
+{
+ const unsigned long offset = 0;
+ const unsigned long len = pci_resource_len(pdev, bar);
+
+ __pcim_release_region_range(pdev, bar, offset, len);
+}
+
+static void pcim_addr_resource_release(struct device *dev, void *resource_raw)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct pcim_addr_devres *res = resource_raw;
+
+ switch (res->type) {
+ case PCIM_ADDR_DEVRES_TYPE_REGION:
+ __pcim_release_region(pdev, res->bar);
+ break;
+ case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
+ pci_iounmap(pdev, res->baseaddr);
+ __pcim_release_region(pdev, res->bar);
+ break;
+ case PCIM_ADDR_DEVRES_TYPE_MAPPING:
+ pci_iounmap(pdev, res->baseaddr);
+ break;
+ case PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING:
+ pci_iounmap(pdev, res->baseaddr);
+ __pcim_release_region_range(pdev, res->bar, res->offset, res->len);
+ break;
+ default:
+ break;
+ }
+}
+
+static struct pcim_addr_devres *pcim_addr_devres_alloc(struct pci_dev *pdev)
+{
+ struct pcim_addr_devres *res;
+
+ res = devres_alloc_node(pcim_addr_resource_release, sizeof(*res),
+ GFP_KERNEL, dev_to_node(&pdev->dev));
+ if (res)
+ pcim_addr_devres_clear(res);
+ return res;
+}
+
+/* Just for consistency and readability. */
+static inline void pcim_addr_devres_free(struct pcim_addr_devres *res)
+{
+ devres_free(res);
+}
+
+/*
+ * Used to identify a resource in devres_*() functions.
+ */
+static int pcim_addr_resources_match(struct device *dev, void *a_raw, void *b_raw)
+{
+ struct pcim_addr_devres *a, *b;
+
+ a = a_raw;
+ b = b_raw;
+
+ (void)dev; /* unused. */
+
+ if (a->type != b->type)
+ return 0;
+
+ switch (a->type) {
+ case PCIM_ADDR_DEVRES_TYPE_REGION:
+ case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
+ return a->bar == b->bar;
+ case PCIM_ADDR_DEVRES_TYPE_MAPPING:
+ return a->baseaddr == b->baseaddr;
+ case PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING:
+ return a->bar == b->bar &&
+ a->offset == b->offset && a->len == b->len;
+ default:
+ break;
+ }
+
+ return 0;
+}
static void devm_pci_unmap_iospace(struct device *dev, void *ptr)
{
@@ -92,8 +305,8 @@ EXPORT_SYMBOL(devm_pci_remap_cfgspace);
*
* All operations are managed and will be undone on driver detach.
*
- * Returns a pointer to the remapped memory or an ERR_PTR() encoded error code
- * on failure. Usage example::
+ * Returns a pointer to the remapped memory or an IOMEM_ERR_PTR() encoded error
+ * code on failure. Usage example::
*
* res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
* base = devm_pci_remap_cfg_resource(&pdev->dev, res);
@@ -341,15 +554,80 @@ void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr)
tbl[i] = NULL;
return;
}
- WARN_ON(1);
}
EXPORT_SYMBOL(pcim_iounmap);
+/**
+ * pcim_iomap_region - Request and iomap a PCI BAR
+ * @pdev: PCI device to map IO resources for
+ * @bar: Index of a BAR to map
+ * @name: Name associated with the request
+ *
+ * Returns __iomem pointer on success, an IOMEM_ERR_PTR on failure.
+ *
+ * Mapping and region will get automatically released on driver detach. If
+ * desired, release manually only with pcim_iounmap_region().
+ */
+void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar, const char *name)
+{
+ int ret = 0;
+ struct pcim_addr_devres *res;
+
+ res = pcim_addr_devres_alloc(pdev);
+ if (!res)
+ return IOMEM_ERR_PTR(-ENOMEM);
+
+ res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
+ res->bar = bar;
+
+ ret = __pcim_request_region(pdev, bar, name, 0);
+ if (ret != 0)
+ goto err_region;
+
+ res->baseaddr = pci_iomap(pdev, bar, 0);
+ if (!res->baseaddr) {
+ ret = -EINVAL;
+ goto err_iomap;
+ }
+
+ devres_add(&pdev->dev, res);
+ return res->baseaddr;
+
+err_iomap:
+ __pcim_release_region(pdev, bar);
+err_region:
+ pcim_addr_devres_free(res);
+
+ return IOMEM_ERR_PTR(ret);
+}
+EXPORT_SYMBOL(pcim_iomap_region);
+
+/**
+ * pcim_iounmap_region - Unmap and release a PCI BAR
+ * @pdev: PCI device to operate on
+ * @bar: Index of BAR to unmap and release
+ *
+ * Unmap a BAR and release its region manually. Only pass BARs that were
+ * previously mapped by pcim_iomap_region().
+ */
+void pcim_iounmap_region(struct pci_dev *pdev, int bar)
+{
+ struct pcim_addr_devres res_searched;
+
+ pcim_addr_devres_clear(&res_searched);
+ res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
+ res_searched.bar = bar;
+
+ devres_release(&pdev->dev, pcim_addr_resource_release,
+ pcim_addr_resources_match, &res_searched);
+}
+EXPORT_SYMBOL(pcim_iounmap_region);
+
/**
* pcim_iomap_regions - Request and iomap PCI BARs
* @pdev: PCI device to map IO resources for
* @mask: Mask of BARs to request and iomap
- * @name: Name used when requesting regions
+ * @name: Name associated with the requests
*
* Request and iomap regions specified by @mask.
*/
@@ -402,7 +680,7 @@ EXPORT_SYMBOL(pcim_iomap_regions);
* pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
* @pdev: PCI device to map IO resources for
* @mask: Mask of BARs to iomap
- * @name: Name used when requesting regions
+ * @name: Name associated with the requests
*
* Request all PCI BARs and iomap regions specified by @mask.
*/
@@ -448,3 +726,196 @@ void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
}
}
EXPORT_SYMBOL(pcim_iounmap_regions);
+
+static int _pcim_request_region(struct pci_dev *pdev, int bar, const char *name,
+ int request_flags)
+{
+ int ret = 0;
+ struct pcim_addr_devres *res;
+
+ res = pcim_addr_devres_alloc(pdev);
+ if (!res)
+ return -ENOMEM;
+ res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
+ res->bar = bar;
+
+ ret = __pcim_request_region(pdev, bar, name, request_flags);
+ if (ret != 0) {
+ pcim_addr_devres_free(res);
+ return ret;
+ }
+
+ devres_add(&pdev->dev, res);
+ return 0;
+}
+
+/**
+ * pcim_request_region - Request a PCI BAR
+ * @pdev: PCI device to requestion region for
+ * @bar: Index of BAR to request
+ * @name: Name associated with the request
+ *
+ * Returns: 0 on success, a negative error code on failure.
+ *
+ * Request region specified by @bar.
+ *
+ * The region will automatically be released on driver detach. If desired,
+ * release manually only with pcim_release_region().
+ */
+int pcim_request_region(struct pci_dev *pdev, int bar, const char *name)
+{
+ return _pcim_request_region(pdev, bar, name, 0);
+}
+EXPORT_SYMBOL(pcim_request_region);
+
+/**
+ * pcim_release_region - Release a PCI BAR
+ * @pdev: PCI device to operate on
+ * @bar: Index of BAR to release
+ *
+ * Release a region manually that was previously requested by
+ * pcim_request_region().
+ */
+void pcim_release_region(struct pci_dev *pdev, int bar)
+{
+ struct pcim_addr_devres res_searched;
+
+ pcim_addr_devres_clear(&res_searched);
+ res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION;
+ res_searched.bar = bar;
+
+ devres_release(&pdev->dev, pcim_addr_resource_release,
+ pcim_addr_resources_match, &res_searched);
+}
+EXPORT_SYMBOL(pcim_release_region);
+
+/**
+ * pcim_iomap_range - Create a ranged __iomap mapping within a PCI BAR
+ * @pdev: PCI device to map IO resources for
+ * @bar: Index of the BAR
+ * @offset: Offset from the begin of the BAR
+ * @len: Length in bytes for the mapping
+ *
+ * Returns: __iomem pointer on success, an IOMEM_ERR_PTR on failure.
+ *
+ * Creates a new IO-Mapping within the specified @bar, ranging from @offset to
+ * @offset + @len.
+ *
+ * The mapping will automatically get unmapped on driver detach. If desired,
+ * release manually only with pcim_iounmap().
+ */
+void __iomem *pcim_iomap_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len)
+{
+ void __iomem *mapping;
+ struct pcim_addr_devres *res;
+
+ res = pcim_addr_devres_alloc(pdev);
+ if (!res)
+ return IOMEM_ERR_PTR(-ENOMEM);
+
+ mapping = pci_iomap_range(pdev, bar, offset, len);
+ if (!mapping) {
+ pcim_addr_devres_free(res);
+ return IOMEM_ERR_PTR(-EINVAL);
+ }
+
+ res->type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
+ res->baseaddr = mapping;
+
+ /*
+ * Ranged mappings don't get added to the legacy-table, since the table
+ * only ever keeps track of whole BARs.
+ */
+
+ devres_add(&pdev->dev, res);
+ return mapping;
+}
+EXPORT_SYMBOL(pcim_iomap_range);
+
+/**
+ * pcim_iomap_region_range - Request and map a range within a PCI BAR
+ * @pdev: PCI device to map IO resources for
+ * @bar: Index of BAR to request within
+ * @offset: Offset from the begin of the BAR
+ * @len: Length in bytes for the mapping
+ * @name: Name associated with the request
+ *
+ * Returns: __iomem pointer on success, an IOMEM_ERR_PTR on failure.
+ *
+ * Request region with a range specified by @offset and @len within @bar and
+ * iomap it.
+ *
+ * The region will automatically be released and the mapping be unmapped on
+ * driver detach. If desired, release manually only with
+ * pcim_iounmap_region_range().
+ *
+ * You probably should only use this function if you explicitly do not want to
+ * request the entire BAR. For most use-cases, combining pcim_request_region()
+ * and pcim_iomap_range() should be sufficient.
+ */
+void __iomem *pcim_iomap_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len, const char *name)
+{
+ int ret = 0;
+ struct pcim_addr_devres *res;
+
+ res = pcim_addr_devres_alloc(pdev);
+ if (!res)
+ return IOMEM_ERR_PTR(-ENOMEM);
+
+ res->type = PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING;
+ res->bar = bar;
+ res->offset = offset;
+ res->len = len;
+
+ ret = __pcim_request_region_range(pdev, bar, offset, len, name, 0);
+ if (ret != 0)
+ goto err_region;
+
+ res->baseaddr = pci_iomap_range(pdev, bar, offset, len);
+ if (!res->baseaddr) {
+ ret = -EINVAL;
+ goto err_iomap;
+ }
+
+ devres_add(&pdev->dev, res);
+ return res->baseaddr;
+
+err_iomap:
+ __pcim_release_region_range(pdev, bar, offset, len);
+err_region:
+ pcim_addr_devres_free(res);
+
+ return IOMEM_ERR_PTR(ret);
+}
+EXPORT_SYMBOL(pcim_iomap_region_range);
+
+/**
+ * pcim_iounmap_region_range - Unmap and release a range within a PCI BAR
+ * @pdev: PCI device to operate on
+ * @bar: Index of BAR containing the range
+ * @offset: Offset from the begin of the BAR
+ * @len: Length in bytes for the mapping
+ *
+ * Unmaps and releases a memory area within the specified PCI BAR.
+ *
+ * This function may not be used to free only part of a range. Only use this
+ * function with the exact parameters you previously used successfully in
+ * pcim_iomap_region_range().
+ */
+void pcim_iounmap_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len)
+{
+ struct pcim_addr_devres res_searched;
+ pcim_addr_devres_clear(&res_searched);
+
+ res_searched.type = PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING;
+ res_searched.bar = bar;
+ res_searched.offset = offset;
+ res_searched.len = len;
+
+ devres_release(&pdev->dev, pcim_addr_resource_release,
+ pcim_addr_resources_match, &res_searched);
+}
+EXPORT_SYMBOL(pcim_iounmap_region_range);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 58a4c976c39b..1b45a4888703 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2314,10 +2314,21 @@ static inline void pci_fixup_device(enum pci_fixup_pass pass,
void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen);
void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr);
void __iomem * const *pcim_iomap_table(struct pci_dev *pdev);
+void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar, const char *name);
+void pcim_iounmap_region(struct pci_dev *pdev, int bar);
int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name);
int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
const char *name);
void pcim_iounmap_regions(struct pci_dev *pdev, int mask);
+int pcim_request_region(struct pci_dev *pdev, int bar, const char *res_name);
+void pcim_release_region(struct pci_dev *pdev, int bar);
+void __iomem *pcim_iomap_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len);
+void __iomem *pcim_iomap_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len,
+ const char *res_name);
+void pcim_iounmap_region_range(struct pci_dev *pdev, int bar,
+ unsigned long offset, unsigned long len);
extern int pci_pci_problems;
#define PCIPCI_FAIL 1 /* No PCI PCI DMA */
--
2.43.0
Now that pure managed region request functions are available, the
implementation of the hybrid-functions which are only sometimes managed
can be made more consistent and readable by wrapping those
always-managed functions.
Implement a new pcim_ function for exclusively requested regions.
Have the pci_request / release functions call their pcim_ counterparts.
Remove the now surplus region_mask from the devres struct.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 49 ++++++++++++++++++++++---------------------
drivers/pci/pci.c | 50 +++++++++++++++-----------------------------
drivers/pci/pci.h | 6 ------
include/linux/pci.h | 1 +
4 files changed, 43 insertions(+), 63 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index e221919ebbe2..1e5cf950775d 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -22,18 +22,15 @@
* _sometimes_ managed (e.g. pci_request_region()).
* Consequently, in the new API, region requests performed by the pcim_
* functions are automatically cleaned up through the devres callback
- * pcim_addr_resource_release(), while requests performed by
- * pcim_enable_device() + pci_*region*() are automatically cleaned up
- * through the for-loop in pcim_release().
+ * pcim_addr_resource_release().
+ * Users utilizing pcim_enable_device() + pci_*region*() are redirected in
+ * pci.c to the managed functions here in this file. This isn't exactly
+ * perfect, but the only alternative way would be to port ALL drivers using
+ * said combination to pcim_ functions.
*
- * TODO 1:
+ * TODO:
* Remove the legacy table entirely once all calls to pcim_iomap_table() in
* the kernel have been removed.
- *
- * TODO 2:
- * Port everyone calling pcim_enable_device() + pci_*region*() to using the
- * pcim_ functions. Then, remove all devres functionality from pci_*region*()
- * functions and remove the associated cleanups described above in point #2.
*/
/*
@@ -407,21 +404,6 @@ static void pcim_release(struct device *gendev, void *res)
{
struct pci_dev *dev = to_pci_dev(gendev);
struct pci_devres *this = res;
- int i;
-
- /*
- * This is legacy code.
- * All regions requested by a pcim_ function do get released through
- * pcim_addr_resource_release(). Thanks to the hybrid nature of the pci_
- * region-request functions, this for-loop has to release the regions
- * if they have been requested by such a function.
- *
- * TODO: Remove this once all users of pcim_enable_device() PLUS
- * pci-region-request-functions have been ported to pcim_ functions.
- */
- for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
- if (this->region_mask & (1 << i))
- pci_release_region(dev, i);
if (this->mwi)
pci_clear_mwi(dev);
@@ -971,6 +953,25 @@ int pcim_request_region(struct pci_dev *pdev, int bar, const char *name)
}
EXPORT_SYMBOL(pcim_request_region);
+/**
+ * pcim_request_region_exclusive - Request a PCI BAR exclusively
+ * @pdev: PCI device to requestion region for
+ * @bar: Index of BAR to request
+ * @name: Name associated with the request
+ *
+ * Returns: 0 on success, a negative error code on failure.
+ *
+ * Request region specified by @bar exclusively.
+ *
+ * The region will automatically be released on driver detach. If desired,
+ * release manually only with pcim_release_region().
+ */
+int pcim_request_region_exclusive(struct pci_dev *pdev, int bar, const char *name)
+{
+ return _pcim_request_region(pdev, bar, name, IORESOURCE_EXCLUSIVE);
+}
+EXPORT_SYMBOL(pcim_request_region_exclusive);
+
/**
* pcim_release_region - Release a PCI BAR
* @pdev: PCI device to operate on
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7ca860acf351..66639dd754c7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3873,7 +3873,15 @@ EXPORT_SYMBOL_GPL(pci_common_swizzle);
*/
void pci_release_region(struct pci_dev *pdev, int bar)
{
- struct pci_devres *dr;
+ /*
+ * This is done for backwards compatibility, because the old pci-devres
+ * API had a mode in which the function became managed if it had been
+ * enabled with pcim_enable_device() instead of pci_enable_device().
+ */
+ if (pci_is_managed(pdev)) {
+ pcim_release_region(pdev, bar);
+ return;
+ }
if (pci_resource_len(pdev, bar) == 0)
return;
@@ -3883,20 +3891,6 @@ void pci_release_region(struct pci_dev *pdev, int bar)
else if (pci_resource_flags(pdev, bar) & IORESOURCE_MEM)
release_mem_region(pci_resource_start(pdev, bar),
pci_resource_len(pdev, bar));
-
- /*
- * This devres utility makes this function sometimes managed
- * (when pcim_enable_device() has been called before).
- * This is bad because it conflicts with the pcim_ functions being
- * exclusively responsible for managed pci. Its "sometimes yes, sometimes
- * no" nature can cause bugs.
- *
- * TODO: Remove this once all users that use pcim_enable_device() PLUS
- * a region request function have been ported to using pcim_ functions.
- */
- dr = find_pci_dr(pdev);
- if (dr)
- dr->region_mask &= ~(1 << bar);
}
EXPORT_SYMBOL(pci_release_region);
@@ -3924,14 +3918,18 @@ EXPORT_SYMBOL(pci_release_region);
* NOTE:
* This is a "hybrid" function: Its normally unmanaged, but becomes managed
* when pcim_enable_device() has been called in advance.
- * This hybrid feature is DEPRECATED! If you need to implement a new pci
- * function that does automatic cleanup, write a new pcim_* function that uses
- * devres directly.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
*/
static int __pci_request_region(struct pci_dev *pdev, int bar,
const char *res_name, int exclusive)
{
- struct pci_devres *dr;
+ if (pci_is_managed(pdev)) {
+ if (exclusive == IORESOURCE_EXCLUSIVE)
+ return pcim_request_region_exclusive(pdev, bar, res_name);
+ else
+ return pcim_request_region(pdev, bar, res_name);
+ }
if (pci_resource_len(pdev, bar) == 0)
return 0;
@@ -3947,20 +3945,6 @@ static int __pci_request_region(struct pci_dev *pdev, int bar,
goto err_out;
}
- /*
- * This devres utility makes this function sometimes managed
- * (when pcim_enable_device() has been called before).
- * This is bad because it conflicts with the pcim_ functions being
- * exclusively responsible for managed pci. Its "sometimes yes, sometimes
- * no" nature can cause bugs.
- *
- * TODO: Remove this once all users that use pcim_enable_device() PLUS
- * a region request function have been ported to using pcim_ functions.
- */
- dr = find_pci_dr(pdev);
- if (dr)
- dr->region_mask |= 1 << bar;
-
return 0;
err_out:
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4d4bcc2d850f..7c2181677760 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -814,12 +814,6 @@ struct pci_devres {
unsigned int orig_intx:1;
unsigned int restore_intx:1;
unsigned int mwi:1;
-
- /*
- * TODO: remove the region_mask once everyone calling
- * pcim_enable_device() + pci_*region*() is ported to pcim_ functions.
- */
- u32 region_mask;
};
struct pci_devres *find_pci_dr(struct pci_dev *pdev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3dcd8a5e10b5..641ee30f7d2d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2321,6 +2321,7 @@ int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
const char *name);
void pcim_iounmap_regions(struct pci_dev *pdev, int mask);
int pcim_request_region(struct pci_dev *pdev, int bar, const char *res_name);
+int pcim_request_region_exclusive(struct pci_dev *pdev, int bar, const char *name);
void pcim_release_region(struct pci_dev *pdev, int bar);
void pcim_release_all_regions(struct pci_dev *pdev);
int pcim_request_all_regions(struct pci_dev *pdev, const char *name);
--
2.43.0
The bit describing whether the PCI device is currently enabled is stored
in the devres-struct. Besides this struct being subject of a cleanup
process, the device-struct is in general the right place to store this
information, since it is not devres-specific.
Move the 'enabled' boolean bit to struct pci_dev.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 11 ++++-------
drivers/pci/pci.c | 17 ++++++++++-------
drivers/pci/pci.h | 1 -
include/linux/pci.h | 1 +
4 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index 1e5cf950775d..bf957f0bc5ac 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -411,7 +411,7 @@ static void pcim_release(struct device *gendev, void *res)
if (this->restore_intx)
pci_intx(dev, this->orig_intx);
- if (this->enabled && !this->pinned)
+ if (!this->pinned)
pci_disable_device(dev);
}
@@ -456,14 +456,11 @@ int pcim_enable_device(struct pci_dev *pdev)
dr = get_pci_dr(pdev);
if (unlikely(!dr))
return -ENOMEM;
- if (dr->enabled)
- return 0;
rc = pci_enable_device(pdev);
- if (!rc) {
+ if (!rc)
pdev->is_managed = 1;
- dr->enabled = 1;
- }
+
return rc;
}
EXPORT_SYMBOL(pcim_enable_device);
@@ -481,7 +478,7 @@ void pcim_pin_device(struct pci_dev *pdev)
struct pci_devres *dr;
dr = find_pci_dr(pdev);
- WARN_ON(!dr || !dr->enabled);
+ WARN_ON(!dr || !pdev->enabled);
if (dr)
dr->pinned = 1;
}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 66639dd754c7..0fb9e7bb0e43 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1953,6 +1953,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
u16 cmd;
u8 pin;
+ if (dev->enabled)
+ return 0;
+
err = pci_set_power_state(dev, PCI_D0);
if (err < 0 && err != -EIO)
return err;
@@ -1967,7 +1970,7 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
pci_fixup_device(pci_fixup_enable, dev);
if (dev->msi_enabled || dev->msix_enabled)
- return 0;
+ goto success_out;
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (pin) {
@@ -1977,6 +1980,8 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
cmd & ~PCI_COMMAND_INTX_DISABLE);
}
+success_out:
+ dev->enabled = true;
return 0;
}
@@ -2146,6 +2151,9 @@ static void do_pci_disable_device(struct pci_dev *dev)
{
u16 pci_command;
+ if (!dev->enabled)
+ return;
+
pci_read_config_word(dev, PCI_COMMAND, &pci_command);
if (pci_command & PCI_COMMAND_MASTER) {
pci_command &= ~PCI_COMMAND_MASTER;
@@ -2153,6 +2161,7 @@ static void do_pci_disable_device(struct pci_dev *dev)
}
pcibios_disable_device(dev);
+ dev->enabled = false;
}
/**
@@ -2180,12 +2189,6 @@ void pci_disable_enabled_device(struct pci_dev *dev)
*/
void pci_disable_device(struct pci_dev *dev)
{
- struct pci_devres *dr;
-
- dr = find_pci_dr(dev);
- if (dr)
- dr->enabled = 0;
-
dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0,
"disabling already-disabled device");
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 7c2181677760..dbb76a3fb0e4 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -809,7 +809,6 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
* when a device is enabled using managed PCI device enable interface.
*/
struct pci_devres {
- unsigned int enabled:1;
unsigned int pinned:1;
unsigned int orig_intx:1;
unsigned int restore_intx:1;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 641ee30f7d2d..a356bdcc14cc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -367,6 +367,7 @@ struct pci_dev {
this is D0-D3, D0 being fully
functional, and D3 being off. */
u8 pm_cap; /* PM capability offset */
+ unsigned int enabled:1; /* Whether this dev is enabled */
unsigned int imm_ready:1; /* Supports Immediate Readiness */
unsigned int pme_support:5; /* Bitmask of states from which PME#
can be generated */
--
2.43.0
The old plural devres functions tie PCI's devres implementation to the
iomap-table mechanism which unfortunately is not extensible.
As the purlal functions are almost never used with more than one bit set
in their bit-mask, deprecating them and encouraging users to use the new
singular functions instead is reasonable.
Furthermore, to make the implementation more consistent and extensible,
the plural functions should use the singular functions.
Add new wrapper to request / release all BARs.
Make the plural functions call into the singular functions.
Mark the plural functions as deprecated.
Remove as much of the iomap-table-mechanism as possible.
Add comments describing the path towards a cleaned-up API.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 369 +++++++++++++++++++++++++++++++++----------
drivers/pci/pci.c | 20 +++
drivers/pci/pci.h | 5 +
include/linux/pci.h | 2 +
4 files changed, 313 insertions(+), 83 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index cc8c1501eb13..e221919ebbe2 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -4,15 +4,43 @@
#include "pci.h"
/*
- * PCI iomap devres
+ * On the state of PCI's devres implementation:
+ *
+ * The older devres API for PCI has two significant problems:
+ *
+ * 1. It is very strongly tied to the statically allocated mapping table in
+ * struct pcim_iomap_devres below. This is mostly solved in the sense of the
+ * pcim_ functions in this file providing things like ranged mapping by
+ * bypassing this table, wheras the functions that were present in the old
+ * API still enter the mapping addresses into the table for users of the old
+ * API.
+ * 2. The region-request-functions in pci.c do become managed IF the device has
+ * been enabled with pcim_enable_device() instead of pci_enable_device().
+ * This resulted in the API becoming inconsistent: Some functions have an
+ * obviously managed counter-part (e.g., pci_iomap() <-> pcim_iomap()),
+ * whereas some don't and are never managed, while others don't and are
+ * _sometimes_ managed (e.g. pci_request_region()).
+ * Consequently, in the new API, region requests performed by the pcim_
+ * functions are automatically cleaned up through the devres callback
+ * pcim_addr_resource_release(), while requests performed by
+ * pcim_enable_device() + pci_*region*() are automatically cleaned up
+ * through the for-loop in pcim_release().
+ *
+ * TODO 1:
+ * Remove the legacy table entirely once all calls to pcim_iomap_table() in
+ * the kernel have been removed.
+ *
+ * TODO 2:
+ * Port everyone calling pcim_enable_device() + pci_*region*() to using the
+ * pcim_ functions. Then, remove all devres functionality from pci_*region*()
+ * functions and remove the associated cleanups described above in point #2.
*/
-#define PCIM_IOMAP_MAX PCI_STD_NUM_BARS
/*
* Legacy struct storing addresses to whole mapped bars.
*/
struct pcim_iomap_devres {
- void __iomem *table[PCIM_IOMAP_MAX];
+ void __iomem *table[PCI_STD_NUM_BARS];
};
enum pcim_addr_devres_type {
@@ -381,6 +409,16 @@ static void pcim_release(struct device *gendev, void *res)
struct pci_devres *this = res;
int i;
+ /*
+ * This is legacy code.
+ * All regions requested by a pcim_ function do get released through
+ * pcim_addr_resource_release(). Thanks to the hybrid nature of the pci_
+ * region-request functions, this for-loop has to release the regions
+ * if they have been requested by such a function.
+ *
+ * TODO: Remove this once all users of pcim_enable_device() PLUS
+ * pci-region-request-functions have been ported to pcim_ functions.
+ */
for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
if (this->region_mask & (1 << i))
pci_release_region(dev, i);
@@ -469,17 +507,16 @@ EXPORT_SYMBOL(pcim_pin_device);
static void pcim_iomap_release(struct device *gendev, void *res)
{
- struct pci_dev *dev = to_pci_dev(gendev);
- struct pcim_iomap_devres *this = res;
- int i;
-
- for (i = 0; i < PCIM_IOMAP_MAX; i++)
- if (this->table[i])
- pci_iounmap(dev, this->table[i]);
+ /*
+ * Do nothing. This is legacy code.
+ *
+ * Cleanup of the mappings is now done directly through the callbacks
+ * registered when creating them.
+ */
}
/**
- * pcim_iomap_table - access iomap allocation table
+ * pcim_iomap_table - access iomap allocation table (DEPRECATED)
* @pdev: PCI device to access iomap table for
*
* Access iomap allocation table for @dev. If iomap table doesn't
@@ -490,6 +527,11 @@ static void pcim_iomap_release(struct device *gendev, void *res)
* This function might sleep when the table is first allocated but can
* be safely called without context and guaranteed to succeed once
* allocated.
+ *
+ * This function is DEPRECATED. Do not use it in new code.
+ * Instead, obtain a mapping's address directly from one of the pcim_* mapping
+ * functions. For example:
+ * void __iomem *mappy = pcim_iomap(pdev, barnr, length);
*/
void __iomem * const *pcim_iomap_table(struct pci_dev *pdev)
{
@@ -508,27 +550,109 @@ void __iomem * const *pcim_iomap_table(struct pci_dev *pdev)
}
EXPORT_SYMBOL(pcim_iomap_table);
+/*
+ * Fill the legacy mapping-table, so that drivers using the old API
+ * can still get a BAR's mapping address through pcim_iomap_table().
+ */
+static int pcim_add_mapping_to_legacy_table(struct pci_dev *pdev,
+ void __iomem *mapping, short bar)
+{
+ void __iomem **legacy_iomap_table;
+
+ if (bar >= PCI_STD_NUM_BARS)
+ return -EINVAL;
+
+ legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+ if (!legacy_iomap_table)
+ return -ENOMEM;
+
+ /* The legacy mechanism doesn't allow for duplicate mappings. */
+ WARN_ON(legacy_iomap_table[bar]);
+
+ legacy_iomap_table[bar] = mapping;
+
+ return 0;
+}
+
+/*
+ * Removes a mapping. The table only contains whole-bar-mappings, so this will
+ * never interfere with ranged mappings.
+ */
+static void pcim_remove_mapping_from_legacy_table(struct pci_dev *pdev,
+ void __iomem *addr)
+{
+ short bar;
+ void __iomem **legacy_iomap_table;
+
+ legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+ if (!legacy_iomap_table)
+ return;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ if (legacy_iomap_table[bar] == addr) {
+ legacy_iomap_table[bar] = NULL;
+ return;
+ }
+ }
+}
+
+/*
+ * The same as pcim_remove_mapping_from_legacy_table(), but identifies the
+ * mapping by its BAR index.
+ */
+static void pcim_remove_bar_from_legacy_table(struct pci_dev *pdev, short bar)
+{
+ void __iomem **legacy_iomap_table;
+
+ if (bar >= PCI_STD_NUM_BARS)
+ return;
+
+ legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+ if (!legacy_iomap_table)
+ return;
+
+ legacy_iomap_table[bar] = NULL;
+}
+
/**
* pcim_iomap - Managed pcim_iomap()
* @pdev: PCI device to iomap for
* @bar: BAR to iomap
* @maxlen: Maximum length of iomap
*
- * Managed pci_iomap(). Map is automatically unmapped on driver
- * detach.
+ * Returns __iomem pointer on success, NULL on failure.
+ *
+ * Managed pci_iomap(). Map is automatically unmapped on driver detach. If
+ * desired, unmap manually only with pcim_iounmap().
+ *
+ * This SHOULD only be used once per BAR.
*/
void __iomem *pcim_iomap(struct pci_dev *pdev, int bar, unsigned long maxlen)
{
- void __iomem **tbl;
-
- BUG_ON(bar >= PCIM_IOMAP_MAX);
+ void __iomem *mapping;
+ struct pcim_addr_devres *res;
- tbl = (void __iomem **)pcim_iomap_table(pdev);
- if (!tbl || tbl[bar]) /* duplicate mappings not allowed */
+ res = pcim_addr_devres_alloc(pdev);
+ if (!res)
return NULL;
+ res->type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
- tbl[bar] = pci_iomap(pdev, bar, maxlen);
- return tbl[bar];
+ mapping = pci_iomap(pdev, bar, maxlen);
+ if (!mapping)
+ goto err_iomap;
+ res->baseaddr = mapping;
+
+ if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) != 0)
+ goto err_table;
+
+ devres_add(&pdev->dev, res);
+ return mapping;
+
+err_table:
+ pci_iounmap(pdev, mapping);
+err_iomap:
+ pcim_addr_devres_free(res);
+ return NULL;
}
EXPORT_SYMBOL(pcim_iomap);
@@ -537,23 +661,24 @@ EXPORT_SYMBOL(pcim_iomap);
* @pdev: PCI device to iounmap for
* @addr: Address to unmap
*
- * Managed pci_iounmap(). @addr must have been mapped using pcim_iomap().
+ * Managed pci_iounmap(). @addr must have been mapped using pcim_iomap() or
+ * pcim_iomap_range().
*/
void pcim_iounmap(struct pci_dev *pdev, void __iomem *addr)
{
- void __iomem **tbl;
- int i;
+ struct pcim_addr_devres res_searched;
- pci_iounmap(pdev, addr);
+ pcim_addr_devres_clear(&res_searched);
+ res_searched.type = PCIM_ADDR_DEVRES_TYPE_MAPPING;
+ res_searched.baseaddr = addr;
- tbl = (void __iomem **)pcim_iomap_table(pdev);
- BUG_ON(!tbl);
+ if (devres_release(&pdev->dev, pcim_addr_resource_release,
+ pcim_addr_resources_match, &res_searched) != 0) {
+ /* Doesn't exist. User passed nonsense. */
+ return;
+ }
- for (i = 0; i < PCIM_IOMAP_MAX; i++)
- if (tbl[i] == addr) {
- tbl[i] = NULL;
- return;
- }
+ pcim_remove_mapping_from_legacy_table(pdev, addr);
}
EXPORT_SYMBOL(pcim_iounmap);
@@ -623,106 +748,184 @@ void pcim_iounmap_region(struct pci_dev *pdev, int bar)
}
EXPORT_SYMBOL(pcim_iounmap_region);
+static inline bool mask_contains_bar(int mask, int bar)
+{
+ return mask & (1 << bar);
+}
+
/**
- * pcim_iomap_regions - Request and iomap PCI BARs
+ * pcim_iomap_regions - Request and iomap PCI BARs (DEPRECATED)
* @pdev: PCI device to map IO resources for
* @mask: Mask of BARs to request and iomap
* @name: Name associated with the requests
*
+ * Returns 0 on success, negative error code on failure.
+ *
* Request and iomap regions specified by @mask.
+ *
+ * This function is DEPRECATED. Don't use it in new code.
+ * Use pcim_iomap_region() instead.
*/
int pcim_iomap_regions(struct pci_dev *pdev, int mask, const char *name)
{
- void __iomem * const *iomap;
- int i, rc;
+ int ret = -1;
+ short bar;
+ void __iomem *mapping;
- iomap = pcim_iomap_table(pdev);
- if (!iomap)
- return -ENOMEM;
+ for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
+ if (!mask_contains_bar(mask, bar))
+ continue;
- for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
- unsigned long len;
+ mapping = pcim_iomap_region(pdev, bar, name);
+ if (IS_ERR(mapping)) {
+ ret = PTR_ERR(mapping);
+ goto err;
+ }
+ ret = pcim_add_mapping_to_legacy_table(pdev, mapping, bar);
+ if (ret != 0)
+ goto err;
+ }
- if (!(mask & (1 << i)))
- continue;
+ return 0;
- rc = -EINVAL;
- len = pci_resource_len(pdev, i);
- if (!len)
- goto err_inval;
+ err:
+ while (--bar >= 0) {
+ pcim_iounmap_region(pdev, bar);
+ pcim_remove_bar_from_legacy_table(pdev, bar);
+ }
- rc = pci_request_region(pdev, i, name);
- if (rc)
- goto err_inval;
+ return ret;
+}
+EXPORT_SYMBOL(pcim_iomap_regions);
- rc = -ENOMEM;
- if (!pcim_iomap(pdev, i, 0))
- goto err_region;
+/**
+ * pcim_release_all_regions - Release all regions of a PCI-device
+ * @pdev: the PCI device
+ *
+ * Returns 0 on success, negative error code on failure.
+ *
+ * Will release all regions previously requested through pcim_request_region()
+ * or pcim_request_all_regions().
+ *
+ * Can be called from any context, i.e., not necessarily as a counterpart to
+ * pcim_request_all_regions().
+ */
+void pcim_release_all_regions(struct pci_dev *pdev)
+{
+ short bar;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++)
+ pcim_release_region(pdev, bar);
+}
+EXPORT_SYMBOL(pcim_release_all_regions);
+
+/**
+ * pcim_request_all_regions - Request all regions
+ * @pdev: PCI device to map IO resources for
+ * @name: name associated with the request
+ *
+ * Requested regions will automatically be released at driver detach. If desired,
+ * release individual regions with pcim_release_region() or all of them at once
+ * with pcim_release_all_regions().
+ */
+int pcim_request_all_regions(struct pci_dev *pdev, const char *name)
+{
+ int ret;
+ short bar;
+
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ ret = pcim_request_region(pdev, bar, name);
+ if (ret != 0)
+ goto err;
}
return 0;
- err_region:
- pci_release_region(pdev, i);
- err_inval:
- while (--i >= 0) {
- if (!(mask & (1 << i)))
- continue;
- pcim_iounmap(pdev, iomap[i]);
- pci_release_region(pdev, i);
- }
+err:
+ pcim_release_all_regions(pdev);
- return rc;
+ return ret;
}
-EXPORT_SYMBOL(pcim_iomap_regions);
+EXPORT_SYMBOL(pcim_request_all_regions);
/**
- * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones
+ * pcim_iomap_regions_request_all - Request all BARs and iomap specified ones (DEPRECATED)
* @pdev: PCI device to map IO resources for
* @mask: Mask of BARs to iomap
* @name: Name associated with the requests
*
+ * Returns 0 on success, negative error code on failure.
+ *
* Request all PCI BARs and iomap regions specified by @mask.
+ *
+ * To release these resources manually, call pcim_release_region() for the
+ * regions and pcim_iounmap() for the mappings.
+ *
+ * This function is DEPRECATED. Don't use it in new code.
+ * Use pcim_request_all_regions() + pcim_iomap*() instead.
*/
int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
const char *name)
{
- int request_mask = ((1 << 6) - 1) & ~mask;
- int rc;
+ short bar;
+ int ret = -1;
+ void __iomem **legacy_iomap_table;
- rc = pci_request_selected_regions(pdev, request_mask, name);
- if (rc)
- return rc;
+ ret = pcim_request_all_regions(pdev, name);
+ if (ret != 0)
+ return ret;
- rc = pcim_iomap_regions(pdev, mask, name);
- if (rc)
- pci_release_selected_regions(pdev, request_mask);
- return rc;
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ if (!mask_contains_bar(mask, bar))
+ continue;
+ if (!pcim_iomap(pdev, bar, 0))
+ goto err;
+ }
+
+ return 0;
+
+err:
+ /*
+ * Here it gets tricky: pcim_iomap() above has most likely
+ * failed because it got an OOM when trying to create the
+ * legacy-table.
+ * We check here if that has happened. If not, pcim_iomap()
+ * must have failed because of EINVAL.
+ */
+ legacy_iomap_table = (void __iomem **)pcim_iomap_table(pdev);
+ if (!legacy_iomap_table)
+ ret = -ENOMEM;
+ else
+ ret = -EINVAL;
+
+ while (--bar >= 0)
+ pcim_iounmap(pdev, legacy_iomap_table[bar]);
+
+ pcim_release_all_regions(pdev);
+
+ return ret;
}
EXPORT_SYMBOL(pcim_iomap_regions_request_all);
/**
- * pcim_iounmap_regions - Unmap and release PCI BARs
+ * pcim_iounmap_regions - Unmap and release PCI BARs (DEPRECATED)
* @pdev: PCI device to map IO resources for
* @mask: Mask of BARs to unmap and release
*
* Unmap and release regions specified by @mask.
+ *
+ * This function is DEPRECATED. Don't use it in new code.
*/
void pcim_iounmap_regions(struct pci_dev *pdev, int mask)
{
- void __iomem * const *iomap;
- int i;
-
- iomap = pcim_iomap_table(pdev);
- if (!iomap)
- return;
+ short bar;
- for (i = 0; i < PCIM_IOMAP_MAX; i++) {
- if (!(mask & (1 << i)))
+ for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
+ if (!mask_contains_bar(mask, bar))
continue;
- pcim_iounmap(pdev, iomap[i]);
- pci_release_region(pdev, i);
+ pcim_iounmap_region(pdev, bar);
+ pcim_remove_bar_from_legacy_table(pdev, bar);
}
}
EXPORT_SYMBOL(pcim_iounmap_regions);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index bf5f9c3a1908..2899df77a285 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3884,6 +3884,16 @@ void pci_release_region(struct pci_dev *pdev, int bar)
release_mem_region(pci_resource_start(pdev, bar),
pci_resource_len(pdev, bar));
+ /*
+ * This devres utility makes this function sometimes managed
+ * (when pcim_enable_device() has been called before).
+ * This is bad because it conflicts with the pcim_ functions being
+ * exclusively responsible for managed pci. Its "sometimes yes, sometimes
+ * no" nature can cause bugs.
+ *
+ * TODO: Remove this once all users that use pcim_enable_device() PLUS
+ * a region request function have been ported to using pcim_ functions.
+ */
dr = find_pci_dr(pdev);
if (dr)
dr->region_mask &= ~(1 << bar);
@@ -3928,6 +3938,16 @@ static int __pci_request_region(struct pci_dev *pdev, int bar,
goto err_out;
}
+ /*
+ * This devres utility makes this function sometimes managed
+ * (when pcim_enable_device() has been called before).
+ * This is bad because it conflicts with the pcim_ functions being
+ * exclusively responsible for managed pci. Its "sometimes yes, sometimes
+ * no" nature can cause bugs.
+ *
+ * TODO: Remove this once all users that use pcim_enable_device() PLUS
+ * a region request function have been ported to using pcim_ functions.
+ */
dr = find_pci_dr(pdev);
if (dr)
dr->region_mask |= 1 << bar;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index de6d61d1e07f..4d4bcc2d850f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -814,6 +814,11 @@ struct pci_devres {
unsigned int orig_intx:1;
unsigned int restore_intx:1;
unsigned int mwi:1;
+
+ /*
+ * TODO: remove the region_mask once everyone calling
+ * pcim_enable_device() + pci_*region*() is ported to pcim_ functions.
+ */
u32 region_mask;
};
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1b45a4888703..3dcd8a5e10b5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2322,6 +2322,8 @@ int pcim_iomap_regions_request_all(struct pci_dev *pdev, int mask,
void pcim_iounmap_regions(struct pci_dev *pdev, int mask);
int pcim_request_region(struct pci_dev *pdev, int bar, const char *res_name);
void pcim_release_region(struct pci_dev *pdev, int bar);
+void pcim_release_all_regions(struct pci_dev *pdev);
+int pcim_request_all_regions(struct pci_dev *pdev, const char *name);
void __iomem *pcim_iomap_range(struct pci_dev *pdev, int bar,
unsigned long offset, unsigned long len);
void __iomem *pcim_iomap_region_range(struct pci_dev *pdev, int bar,
--
2.43.0
managing mwi with devres can easily be done with its own callback,
without the necessity to store any state about it in a device-related
struct.
Remove the mwi state from the devres-struct.
Make the devres-mwi functions set a separate devres-callback.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 23 +++++++++++------------
drivers/pci/pci.h | 1 -
2 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index de8cf6f87dd7..911c2037d9fd 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -378,24 +378,26 @@ void __iomem *devm_pci_remap_cfg_resource(struct device *dev,
}
EXPORT_SYMBOL(devm_pci_remap_cfg_resource);
+static void __pcim_clear_mwi(void *pdev_raw)
+{
+ struct pci_dev *pdev = pdev_raw;
+
+ pci_clear_mwi(pdev);
+}
+
/**
* pcim_set_mwi - a device-managed pci_set_mwi()
- * @dev: the PCI device for which MWI is enabled
+ * @pdev: the PCI device for which MWI is enabled
*
* Managed pci_set_mwi().
*
* RETURNS: An appropriate -ERRNO error value on error, or zero for success.
*/
-int pcim_set_mwi(struct pci_dev *dev)
+int pcim_set_mwi(struct pci_dev *pdev)
{
- struct pci_devres *dr;
-
- dr = find_pci_dr(dev);
- if (!dr)
- return -ENOMEM;
+ devm_add_action(&pdev->dev, __pcim_clear_mwi, pdev);
- dr->mwi = 1;
- return pci_set_mwi(dev);
+ return pci_set_mwi(pdev);
}
EXPORT_SYMBOL(pcim_set_mwi);
@@ -405,9 +407,6 @@ static void pcim_release(struct device *gendev, void *res)
struct pci_dev *dev = to_pci_dev(gendev);
struct pci_devres *this = res;
- if (this->mwi)
- pci_clear_mwi(dev);
-
if (this->restore_intx)
pci_intx(dev, this->orig_intx);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 3d9908a69ebf..667bfdd61d5e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -811,7 +811,6 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
struct pci_devres {
unsigned int orig_intx:1;
unsigned int restore_intx:1;
- unsigned int mwi:1;
};
struct pci_devres *find_pci_dr(struct pci_dev *pdev);
--
2.43.0
The bit describing whether the PCI device is currently pinned is stored
in the PCI devres struct. To clean up and simplify the pci-devres API,
it's better if this information is stored in the pci_dev struct, because
it allows for checking that device's pinned-status directly through the
device struct.
This will later permit simplifying pcim_enable_device().
Move the 'pinned' boolean bit to struct pci_dev.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 14 ++++----------
drivers/pci/pci.h | 1 -
include/linux/pci.h | 1 +
3 files changed, 5 insertions(+), 11 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index bf957f0bc5ac..de8cf6f87dd7 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -411,7 +411,7 @@ static void pcim_release(struct device *gendev, void *res)
if (this->restore_intx)
pci_intx(dev, this->orig_intx);
- if (!this->pinned)
+ if (!dev->pinned)
pci_disable_device(dev);
}
@@ -469,18 +469,12 @@ EXPORT_SYMBOL(pcim_enable_device);
* pcim_pin_device - Pin managed PCI device
* @pdev: PCI device to pin
*
- * Pin managed PCI device @pdev. Pinned device won't be disabled on
- * driver detach. @pdev must have been enabled with
- * pcim_enable_device().
+ * Pin managed PCI device @pdev. Pinned device won't be disabled on driver
+ * detach. @pdev must have been enabled with pcim_enable_device().
*/
void pcim_pin_device(struct pci_dev *pdev)
{
- struct pci_devres *dr;
-
- dr = find_pci_dr(pdev);
- WARN_ON(!dr || !pdev->enabled);
- if (dr)
- dr->pinned = 1;
+ pdev->pinned = true;
}
EXPORT_SYMBOL(pcim_pin_device);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index dbb76a3fb0e4..3d9908a69ebf 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -809,7 +809,6 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
* when a device is enabled using managed PCI device enable interface.
*/
struct pci_devres {
- unsigned int pinned:1;
unsigned int orig_intx:1;
unsigned int restore_intx:1;
unsigned int mwi:1;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a356bdcc14cc..2f6f44991003 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -368,6 +368,7 @@ struct pci_dev {
functional, and D3 being off. */
u8 pm_cap; /* PM capability offset */
unsigned int enabled:1; /* Whether this dev is enabled */
+ unsigned int pinned:1; /* Whether this dev is pinned */
unsigned int imm_ready:1; /* Supports Immediate Readiness */
unsigned int pme_support:5; /* Bitmask of states from which PME#
can be generated */
--
2.43.0
pci_intx() is one of the functions that have "hybrid mode" (i.e.,
sometimes managed, sometimes not). Providing a separate pcim_intx()
function with its own device resource and cleanup callback allows for
removing further large parts of the legacy pci-devres implementation.
As in the region-request-functions, pci_intx() has to call into its
managed counterpart for backwards compatibility.
Implement pcim_intx() with its own device resource.
Make pci_intx() call pcim_intx() in the managed case.
Remove the legacy devres struct from pci.h.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 77 +++++++++++++++++++++++++++++++++++---------
drivers/pci/pci.c | 23 +++++++------
drivers/pci/pci.h | 20 ------------
include/linux/pci.h | 1 +
4 files changed, 75 insertions(+), 46 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index 911c2037d9fd..7c4edcaeb8fe 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -40,6 +40,11 @@ struct pcim_iomap_devres {
void __iomem *table[PCI_STD_NUM_BARS];
};
+/* Used to restore the old intx state on driver detach. */
+struct pcim_intx_devres {
+ int orig_intx;
+};
+
enum pcim_addr_devres_type {
/* Default initializer. */
PCIM_ADDR_DEVRES_TYPE_INVALID,
@@ -401,31 +406,71 @@ int pcim_set_mwi(struct pci_dev *pdev)
}
EXPORT_SYMBOL(pcim_set_mwi);
+static void pcim_intx_restore(struct device *dev, void *data)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct pcim_intx_devres *res = data;
-static void pcim_release(struct device *gendev, void *res)
+ pci_intx(pdev, res->orig_intx);
+}
+
+static struct pcim_intx_devres *get_or_create_intx_devres(struct device *dev)
{
- struct pci_dev *dev = to_pci_dev(gendev);
- struct pci_devres *this = res;
+ struct pcim_intx_devres *res;
- if (this->restore_intx)
- pci_intx(dev, this->orig_intx);
+ res = devres_find(dev, pcim_intx_restore, NULL, NULL);
+ if (res)
+ return res;
- if (!dev->pinned)
- pci_disable_device(dev);
+ res = devres_alloc(pcim_intx_restore, sizeof(*res), GFP_KERNEL);
+ if (res)
+ devres_add(dev, res);
+
+ return res;
}
-/*
- * TODO:
- * Once the last four callers in pci.c are ported, this function here needs to
- * be made static again.
+/**
+ * pcim_intx - managed pci_intx()
+ * @pdev: the PCI device to operate on
+ * @enable: boolean: whether to enable or disable PCI INTx
+ *
+ * Returns: 0 on success, -ENOMEM on error.
+ *
+ * Enables/disables PCI INTx for device @pdev.
+ * Restores the original state on driver detach.
*/
-struct pci_devres *find_pci_dr(struct pci_dev *pdev)
+int pcim_intx(struct pci_dev *pdev, int enable)
{
- if (pci_is_managed(pdev))
- return devres_find(&pdev->dev, pcim_release, NULL, NULL);
- return NULL;
+ u16 pci_command, new;
+ struct pcim_intx_devres *res;
+
+ res = get_or_create_intx_devres(&pdev->dev);
+ if (!res)
+ return -ENOMEM;
+
+ res->orig_intx = !enable;
+
+ pci_read_config_word(pdev, PCI_COMMAND, &pci_command);
+
+ if (enable)
+ new = pci_command & ~PCI_COMMAND_INTX_DISABLE;
+ else
+ new = pci_command | PCI_COMMAND_INTX_DISABLE;
+
+ if (new != pci_command)
+ pci_write_config_word(pdev, PCI_COMMAND, new);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pcim_intx);
+
+static void pcim_release(struct device *gendev, void *res)
+{
+ struct pci_dev *dev = to_pci_dev(gendev);
+
+ if (!dev->pinned)
+ pci_disable_device(dev);
}
-EXPORT_SYMBOL(find_pci_dr);
static struct pci_devres *get_pci_dr(struct pci_dev *pdev)
{
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0fb9e7bb0e43..a03ad3f17540 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4465,11 +4465,23 @@ void pci_disable_parity(struct pci_dev *dev)
* This is a "hybrid" function: Its normally unmanaged, but becomes managed
* when pcim_enable_device() has been called in advance.
* This hybrid feature is DEPRECATED!
+ * Use pcim_intx() if you need a managed version.
*/
void pci_intx(struct pci_dev *pdev, int enable)
{
u16 pci_command, new;
+ /*
+ * This is done for backwards compatibility, because the old pci-devres
+ * API had a mode in which this function became managed if the dev had
+ * been enabled with pcim_enable_device() instead of pci_enable_device().
+ */
+ if (pci_is_managed(pdev)) {
+ if (pcim_intx(pdev, enable) != 0)
+ WARN_ON_ONCE(1);
+ return;
+ }
+
pci_read_config_word(pdev, PCI_COMMAND, &pci_command);
if (enable)
@@ -4477,17 +4489,8 @@ void pci_intx(struct pci_dev *pdev, int enable)
else
new = pci_command | PCI_COMMAND_INTX_DISABLE;
- if (new != pci_command) {
- struct pci_devres *dr;
-
+ if (new != pci_command)
pci_write_config_word(pdev, PCI_COMMAND, new);
-
- dr = find_pci_dr(pdev);
- if (dr && !dr->restore_intx) {
- dr->restore_intx = 1;
- dr->orig_intx = !enable;
- }
- }
}
EXPORT_SYMBOL_GPL(pci_intx);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 667bfdd61d5e..f43873049d52 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -795,26 +795,6 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
}
#endif
-/*
- * TODO:
- * The following two components wouldn't need to be here if they weren't used at
- * four last places in pci.c
- * Port or move these functions to devres.c and then remove the
- * pci_devres-components from this header file here.
- */
-/*
- * Managed PCI resources. This manages device on/off, INTx/MSI/MSI-X
- * on/off and BAR regions. pci_dev itself records MSI/MSI-X status, so
- * there's no need to track it separately. pci_devres is initialized
- * when a device is enabled using managed PCI device enable interface.
- */
-struct pci_devres {
- unsigned int orig_intx:1;
- unsigned int restore_intx:1;
-};
-
-struct pci_devres *find_pci_dr(struct pci_dev *pdev);
-
/*
* Config Address for PCI Configuration Mechanism #1
*
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2f6f44991003..721657cb59d8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1354,6 +1354,7 @@ int pci_try_set_mwi(struct pci_dev *dev);
void pci_clear_mwi(struct pci_dev *dev);
void pci_disable_parity(struct pci_dev *dev);
void pci_intx(struct pci_dev *dev, int enable);
+int pcim_intx(struct pci_dev *dev, int enable);
bool pci_check_and_mask_intx(struct pci_dev *dev);
bool pci_check_and_unmask_intx(struct pci_dev *dev);
int pci_wait_for_pending(struct pci_dev *dev, int pos, u16 mask);
--
2.43.0
Thanks to preceding cleanup steps, pcim_release() is now not needed
anymore and can be replaced by pcim_disable_device(), which is the exact
counterpart to pcim_enable_device().
This permits removing further parts of the old devres API.
Replace pcim_release() with pcim_disable_device().
Remove the now surplus get_dr() function.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/devres.c | 38 ++++++++------------------------------
1 file changed, 8 insertions(+), 30 deletions(-)
diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
index 7c4edcaeb8fe..87bc62be21eb 100644
--- a/drivers/pci/devres.c
+++ b/drivers/pci/devres.c
@@ -464,48 +464,26 @@ int pcim_intx(struct pci_dev *pdev, int enable)
}
EXPORT_SYMBOL_GPL(pcim_intx);
-static void pcim_release(struct device *gendev, void *res)
+static void pcim_disable_device(void *pdev_raw)
{
- struct pci_dev *dev = to_pci_dev(gendev);
-
- if (!dev->pinned)
- pci_disable_device(dev);
-}
-
-static struct pci_devres *get_pci_dr(struct pci_dev *pdev)
-{
- struct pci_devres *dr, *new_dr;
-
- dr = devres_find(&pdev->dev, pcim_release, NULL, NULL);
- if (dr)
- return dr;
+ struct pci_dev *pdev = pdev_raw;
- new_dr = devres_alloc(pcim_release, sizeof(*new_dr), GFP_KERNEL);
- if (!new_dr)
- return NULL;
- return devres_get(&pdev->dev, new_dr, NULL, NULL);
+ if (!pdev->pinned)
+ pci_disable_device(pdev);
}
/**
* pcim_enable_device - Managed pci_enable_device()
* @pdev: PCI device to be initialized
*
- * Managed pci_enable_device().
+ * Managed pci_enable_device(). Device will automatically be disabled on
+ * driver detach.
*/
int pcim_enable_device(struct pci_dev *pdev)
{
- struct pci_devres *dr;
- int rc;
-
- dr = get_pci_dr(pdev);
- if (unlikely(!dr))
- return -ENOMEM;
-
- rc = pci_enable_device(pdev);
- if (!rc)
- pdev->is_managed = 1;
+ devm_add_action(&pdev->dev, pcim_disable_device, pdev);
- return rc;
+ return pci_enable_device(pdev);
}
EXPORT_SYMBOL(pcim_enable_device);
--
2.43.0
When the managed PCI-API was introduced to this driver, it was falsly
assumed that initializing the device with pcim_enable_device() instead
of pci_enable_device() will make all PCI functions managed.
This is wrong and was caused by the quite confusing devres API for PCI:
The function pci_iomap_range() is never managed.
Replace pci_iomap_range() with the actually managed function
pcim_iomap_range().
Additionally, add a call to pcim_request_region() to ensure exclusive
access to BAR 0.
CC: <[email protected]> # v5.10+
Fixes: 8558de401b5f ("drm/vboxvideo: use managed pci functions")
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/gpu/drm/vboxvideo/vbox_main.c | 24 +++++++++++++-----------
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/vboxvideo/vbox_main.c b/drivers/gpu/drm/vboxvideo/vbox_main.c
index 42c2d8a99509..7f686a0190e6 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_main.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_main.c
@@ -42,12 +42,11 @@ static int vbox_accel_init(struct vbox_private *vbox)
/* Take a command buffer for each screen from the end of usable VRAM. */
vbox->available_vram_size -= vbox->num_crtcs * VBVA_MIN_BUFFER_SIZE;
- vbox->vbva_buffers = pci_iomap_range(pdev, 0,
- vbox->available_vram_size,
- vbox->num_crtcs *
- VBVA_MIN_BUFFER_SIZE);
- if (!vbox->vbva_buffers)
- return -ENOMEM;
+ vbox->vbva_buffers = pcim_iomap_range(
+ pdev, 0, vbox->available_vram_size,
+ vbox->num_crtcs * VBVA_MIN_BUFFER_SIZE);
+ if (IS_ERR(vbox->vbva_buffers))
+ return PTR_ERR(vbox->vbva_buffers);
for (i = 0; i < vbox->num_crtcs; ++i) {
vbva_setup_buffer_context(&vbox->vbva_info[i],
@@ -115,12 +114,15 @@ int vbox_hw_init(struct vbox_private *vbox)
DRM_INFO("VRAM %08x\n", vbox->full_vram_size);
+ ret = pcim_request_region(pdev, 0, "vboxvideo");
+ if (ret)
+ return ret;
+
/* Map guest-heap at end of vram */
- vbox->guest_heap =
- pci_iomap_range(pdev, 0, GUEST_HEAP_OFFSET(vbox),
- GUEST_HEAP_SIZE);
- if (!vbox->guest_heap)
- return -ENOMEM;
+ vbox->guest_heap = pcim_iomap_range(pdev, 0,
+ GUEST_HEAP_OFFSET(vbox), GUEST_HEAP_SIZE);
+ if (IS_ERR(vbox->guest_heap))
+ return PTR_ERR(vbox->guest_heap);
/* Create guest-heap mem-pool use 2^4 = 16 byte chunks */
vbox->guest_pool = devm_gen_pool_create(vbox->ddev.dev, 4, -1,
--
2.43.0
The PCI region-request functions become managed functions when
pcim_enable_device() has been called previously instead of
pci_enable_device().
This has already caused bugs by confusing users, who came to believe
that all pci functions, such as pci_iomap(), suddenly are managed that
way.
This is not the case.
Add comments to the relevant functions' docstrings that warn users about
this behavior.
Signed-off-by: Philipp Stanner <[email protected]>
---
drivers/pci/iomap.c | 18 ++++++++++++++
drivers/pci/pci.c | 60 ++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 77 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/iomap.c b/drivers/pci/iomap.c
index b7faf22ec8f5..0b7fcc2f7a49 100644
--- a/drivers/pci/iomap.c
+++ b/drivers/pci/iomap.c
@@ -23,6 +23,11 @@
*
* @maxlen specifies the maximum length to map. If you want to get access to
* the complete BAR from offset to the end, pass %0 here.
+ *
+ * NOTE:
+ * This function is never managed, even if you initialized with
+ * pcim_enable_device().
+ * If you need automatic cleanup, use pcim_iomap_range().
* */
void __iomem *pci_iomap_range(struct pci_dev *dev,
int bar,
@@ -63,6 +68,10 @@ EXPORT_SYMBOL(pci_iomap_range);
*
* @maxlen specifies the maximum length to map. If you want to get access to
* the complete BAR from offset to the end, pass %0 here.
+ *
+ * NOTE:
+ * This function is never managed, even if you initialized with
+ * pcim_enable_device().
* */
void __iomem *pci_iomap_wc_range(struct pci_dev *dev,
int bar,
@@ -106,6 +115,11 @@ EXPORT_SYMBOL_GPL(pci_iomap_wc_range);
*
* @maxlen specifies the maximum length to map. If you want to get access to
* the complete BAR without checking for its length first, pass %0 here.
+ *
+ * NOTE:
+ * This function is never managed, even if you initialized with
+ * pcim_enable_device().
+ * If you need automatic cleanup, use pcim_iomap().
* */
void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
{
@@ -127,6 +141,10 @@ EXPORT_SYMBOL(pci_iomap);
*
* @maxlen specifies the maximum length to map. If you want to get access to
* the complete BAR without checking for its length first, pass %0 here.
+ *
+ * NOTE:
+ * This function is never managed, even if you initialized with
+ * pcim_enable_device().
* */
void __iomem *pci_iomap_wc(struct pci_dev *dev, int bar, unsigned long maxlen)
{
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2899df77a285..7ca860acf351 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3907,6 +3907,8 @@ EXPORT_SYMBOL(pci_release_region);
* @res_name: Name to be associated with resource.
* @exclusive: whether the region access is exclusive or not
*
+ * Returns: 0 on success, negative error code on failure.
+ *
* Mark the PCI region associated with PCI device @pdev BAR @bar as
* being reserved by owner @res_name. Do not access any
* address inside the PCI regions unless this call returns
@@ -3918,6 +3920,13 @@ EXPORT_SYMBOL(pci_release_region);
*
* Returns 0 on success, or %EBUSY on error. A warning
* message is also printed on failure.
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you need to implement a new pci
+ * function that does automatic cleanup, write a new pcim_* function that uses
+ * devres directly.
*/
static int __pci_request_region(struct pci_dev *pdev, int bar,
const char *res_name, int exclusive)
@@ -3966,6 +3975,8 @@ static int __pci_request_region(struct pci_dev *pdev, int bar,
* @bar: BAR to be reserved
* @res_name: Name to be associated with resource
*
+ * Returns: 0 on success, negative error code on failure.
+ *
* Mark the PCI region associated with PCI device @pdev BAR @bar as
* being reserved by owner @res_name. Do not access any
* address inside the PCI regions unless this call returns
@@ -3973,6 +3984,12 @@ static int __pci_request_region(struct pci_dev *pdev, int bar,
*
* Returns 0 on success, or %EBUSY on error. A warning
* message is also printed on failure.
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
*/
int pci_request_region(struct pci_dev *pdev, int bar, const char *res_name)
{
@@ -3998,6 +4015,13 @@ void pci_release_selected_regions(struct pci_dev *pdev, int bars)
}
EXPORT_SYMBOL(pci_release_selected_regions);
+/*
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
+ */
static int __pci_request_selected_regions(struct pci_dev *pdev, int bars,
const char *res_name, int excl)
{
@@ -4023,6 +4047,14 @@ static int __pci_request_selected_regions(struct pci_dev *pdev, int bars,
* @pdev: PCI device whose resources are to be reserved
* @bars: Bitmask of BARs to be requested
* @res_name: Name to be associated with resource
+ *
+ * Returns: 0 on success, negative error code on failure.
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
*/
int pci_request_selected_regions(struct pci_dev *pdev, int bars,
const char *res_name)
@@ -4031,6 +4063,20 @@ int pci_request_selected_regions(struct pci_dev *pdev, int bars,
}
EXPORT_SYMBOL(pci_request_selected_regions);
+/**
+ * pci_request_selected_regions_exclusive - Request regions exclusively
+ * @pdev: PCI device to request regions from
+ * @bars: bit mask of bars to request
+ * @res_name: name to be associated with the requests
+ *
+ * Returns: 0 on success, negative error code on failure.
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
+ */
int pci_request_selected_regions_exclusive(struct pci_dev *pdev, int bars,
const char *res_name)
{
@@ -4048,7 +4094,6 @@ EXPORT_SYMBOL(pci_request_selected_regions_exclusive);
* successful call to pci_request_regions(). Call this function only
* after all use of the PCI regions has ceased.
*/
-
void pci_release_regions(struct pci_dev *pdev)
{
pci_release_selected_regions(pdev, (1 << PCI_STD_NUM_BARS) - 1);
@@ -4080,6 +4125,8 @@ EXPORT_SYMBOL(pci_request_regions);
* @pdev: PCI device whose resources are to be reserved
* @res_name: Name to be associated with resource.
*
+ * Returns: 0 on success, negative error code on failure.
+ *
* Mark all PCI regions associated with PCI device @pdev as being reserved
* by owner @res_name. Do not access any address inside the PCI regions
* unless this call returns successfully.
@@ -4089,6 +4136,12 @@ EXPORT_SYMBOL(pci_request_regions);
*
* Returns 0 on success, or %EBUSY on error. A warning message is also
* printed on failure.
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED! If you want managed cleanup, use the
+ * pcim_* functions instead.
*/
int pci_request_regions_exclusive(struct pci_dev *pdev, const char *res_name)
{
@@ -4420,6 +4473,11 @@ void pci_disable_parity(struct pci_dev *dev)
* @enable: boolean: whether to enable or disable PCI INTx
*
* Enables/disables PCI INTx for device @pdev
+ *
+ * NOTE:
+ * This is a "hybrid" function: Its normally unmanaged, but becomes managed
+ * when pcim_enable_device() has been called in advance.
+ * This hybrid feature is DEPRECATED!
*/
void pci_intx(struct pci_dev *pdev, int enable)
{
--
2.43.0
On Mon, Jan 15, 2024 at 03:46:11PM +0100, Philipp Stanner wrote:
> ...
> pci: add new set of devres functions
> pci: deprecate iomap-table functions
> pci: warn users about complicated devres nature
> pci: devres: make devres region requests consistent
> pci: move enabled status bit to pci_dev struct
> pci: move pinned status bit to pci_dev struct
> pci: devres: give mwi its own callback
> pci: devres: give pci(m)_intx its own callback
> pci: devres: remove legacy pcim_release()
> drm/vboxvideo: fix mapping leaks
If/when you update these, take a look at the drivers/pci/ subject line
history and capitalize these to match.
We haven't really used the "devres" prefix in drivers/pci.
The de facto convention is:
- "PCI/AER:" for major features defined by the PCIe base spec (AER,
DPC, VGA, ASPM, PM, etc).
- "PCI: iproc:" for controller drivers (iproc, dwc, qcom, mvebu,
etc).
- "PCI:" for everything else
On Mon, Jan 15, 2024 at 03:46:18PM +0100, Philipp Stanner wrote:
> managing mwi with devres can easily be done with its own callback,
> without the necessity to store any state about it in a device-related
> struct.
s/managing/Managing/
s/mwi/MWI/ since this is an initialism. Also in subject (but would be
even better if it could mention an actual function name).
> Remove the mwi state from the devres-struct.
> Make the devres-mwi functions set a separate devres-callback.
Wrap to fill 75 columns (or add blank lines if you intend multiple
paragraphs).
s/devres-struct/struct pci_devres/ to make this greppable.
s/the devres-mwi functions/pcim_set_mwi()/ similarly.
> Signed-off-by: Philipp Stanner <[email protected]>
> ---
> drivers/pci/devres.c | 23 +++++++++++------------
> drivers/pci/pci.h | 1 -
> 2 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/pci/devres.c b/drivers/pci/devres.c
> index de8cf6f87dd7..911c2037d9fd 100644
> --- a/drivers/pci/devres.c
> +++ b/drivers/pci/devres.c
> @@ -378,24 +378,26 @@ void __iomem *devm_pci_remap_cfg_resource(struct device *dev,
> }
> EXPORT_SYMBOL(devm_pci_remap_cfg_resource);
>
> +static void __pcim_clear_mwi(void *pdev_raw)
> +{
> + struct pci_dev *pdev = pdev_raw;
> +
> + pci_clear_mwi(pdev);
> +}
> +
> /**
> * pcim_set_mwi - a device-managed pci_set_mwi()
> - * @dev: the PCI device for which MWI is enabled
> + * @pdev: the PCI device for which MWI is enabled
> *
> * Managed pci_set_mwi().
> *
> * RETURNS: An appropriate -ERRNO error value on error, or zero for success.
> */
> -int pcim_set_mwi(struct pci_dev *dev)
> +int pcim_set_mwi(struct pci_dev *pdev)
> {
> - struct pci_devres *dr;
> -
> - dr = find_pci_dr(dev);
> - if (!dr)
> - return -ENOMEM;
> + devm_add_action(&pdev->dev, __pcim_clear_mwi, pdev);
>
> - dr->mwi = 1;
> - return pci_set_mwi(dev);
> + return pci_set_mwi(pdev);
> }
> EXPORT_SYMBOL(pcim_set_mwi);
>
> @@ -405,9 +407,6 @@ static void pcim_release(struct device *gendev, void *res)
> struct pci_dev *dev = to_pci_dev(gendev);
> struct pci_devres *this = res;
>
> - if (this->mwi)
> - pci_clear_mwi(dev);
> -
> if (this->restore_intx)
> pci_intx(dev, this->orig_intx);
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 3d9908a69ebf..667bfdd61d5e 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -811,7 +811,6 @@ static inline pci_power_t mid_pci_get_power_state(struct pci_dev *pdev)
> struct pci_devres {
> unsigned int orig_intx:1;
> unsigned int restore_intx:1;
> - unsigned int mwi:1;
> };
>
> struct pci_devres *find_pci_dr(struct pci_dev *pdev);
> --
> 2.43.0
>
On Mon, Jan 15, 2024 at 03:46:12PM +0100, Philipp Stanner wrote:
> PCI's devres API is not extensible to ranged mappings and has
> bug-provoking features. Improve that by providing better alternatives.
I guess "ranged mappings" means a mapping that doesn't cover an entire
BAR? Maybe there's a way to clarify?
> When the original devres API for PCI was implemented, priority was given
> to the creation of a set of "pural functions" such as
> pcim_request_regions(). These functions have bit masks as parameters to
> specify which BARs shall get mapped. Most users, however, only use those
> to mapp 1-3 BARs.
> A complete set of "singular functions" does not exist.
s/mapp/map/
Rewrap to fill 75 columns or add blank lines between paragraphs. Also
below.
> As functions mapping / requesting multiple BARs at once have (almost) no
> mechanism in C to return the resources to the caller of the plural
> function, the devres API utilizes the iomap-table administrated by the
> function pcim_iomap_table().
>
> The entire PCI devres implementation was strongly tied to that table
> which only allows for mapping whole, complete BARs, as the BAR's index
> is used as table index. Consequently, it's not possible to, e.g., have a
> pcim_iomap_range() function with that mechanism.
>
> An additional problem is that pci-devres has been ipmlemented in a sort
> of "hybrid-mode": Some unmanaged functions have managed counterparts
> (e.g.: pci_iomap() <-> pcim_iomap()), making their managed nature
> obvious to the programmer. However, the region-request functions in
> pci.c, prefixed with pci_, behave either managed or unmanaged, depending
> on whether pci_enable_device() or pcim_enable_device() has been called
> in advance.
s/ipmlemented/implemented/
> This hybrid API is confusing and should be more cleanly separated by
> providing always-managed functions prefixed with pcim_.
>
> Thus, the existing devres API is not desirable because:
> a) The vast majority of the users of the plural functions only
> ever sets a single bit in the bit mask, consequently making
> them singular functions anyways.
> b) There is no mechanism to request / iomap only part of a BAR.
> c) The iomap-table mechanism is over-engineered, complicated and
> can by definition not perform bounds checks, thus, provoking
> memory faults: pcim_iomap_table(pdev)[42]
Not sure what "pcim_iomap_table(pdev)[42]" means.
> d) region-request functions being sometimes managed and
> sometimes not is bug-provoking.
Indent with spaces (not tabs) so it still looks good when "git log"
adds spaces to indent.
> + * Legacy struct storing addresses to whole mapped bars.
s/bar/BAR/ (several places).
> + /* A region spaning an entire bar. */
> + PCIM_ADDR_DEVRES_TYPE_REGION,
> +
> + /* A region spaning an entire bar, and a mapping for that whole bar. */
> + PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
> +
> + /*
> + * A mapping within a bar, either spaning the whole bar or just a range.
> + * Without a requested region.
s/spaning/spanning/ (several places).
> + if (start == 0 || len == 0) /* that's an unused BAR. */
s/that/That/
> + /*
> + * Ranged mappings don't get added to the legacy-table, since the table
> + * only ever keeps track of whole BARs.
> + */
> +
Spurious blank line.
> + devres_add(&pdev->dev, res);
> + return mapping;
> +}
> +EXPORT_SYMBOL(pcim_iomap_range);
Bjorn
Mon, Jan 15, 2024 at 03:46:15PM +0100, Philipp Stanner kirjoitti:
> Now that pure managed region request functions are available, the
> implementation of the hybrid-functions which are only sometimes managed
> can be made more consistent and readable by wrapping those
> always-managed functions.
>
> Implement a new pcim_ function for exclusively requested regions.
> Have the pci_request / release functions call their pcim_ counterparts.
> Remove the now surplus region_mask from the devres struct.
..
> + if (pci_is_managed(pdev)) {
> + if (exclusive == IORESOURCE_EXCLUSIVE)
> + return pcim_request_region_exclusive(pdev, bar, res_name);
> + else
Redundant 'else'
> + return pcim_request_region(pdev, bar, res_name);
> + }
--
With Best Regards,
Andy Shevchenko
Mon, Jan 15, 2024 at 03:46:13PM +0100, Philipp Stanner kirjoitti:
> The old plural devres functions tie PCI's devres implementation to the
> iomap-table mechanism which unfortunately is not extensible.
>
> As the purlal functions are almost never used with more than one bit set
> in their bit-mask, deprecating them and encouraging users to use the new
> singular functions instead is reasonable.
>
> Furthermore, to make the implementation more consistent and extensible,
> the plural functions should use the singular functions.
>
> Add new wrapper to request / release all BARs.
> Make the plural functions call into the singular functions.
> Mark the plural functions as deprecated.
> Remove as much of the iomap-table-mechanism as possible.
> Add comments describing the path towards a cleaned-up API.
..
> static void pcim_iomap_release(struct device *gendev, void *res)
> {
> - struct pci_dev *dev = to_pci_dev(gendev);
> - struct pcim_iomap_devres *this = res;
> - int i;
> -
> - for (i = 0; i < PCIM_IOMAP_MAX; i++)
> - if (this->table[i])
> - pci_iounmap(dev, this->table[i]);
> + /*
> + * Do nothing. This is legacy code.
> + *
> + * Cleanup of the mappings is now done directly through the callbacks
> + * registered when creating them.
> + */
> }
How many users we have? Can't we simply kill it for good?
..
> + * This function is DEPRECATED. Do not use it in new code.
We have __deprecated IIRC, can it be used?
..
> + if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) != 0)
Redundant ' != 0' part.
> + goto err_table;
..
> +static inline bool mask_contains_bar(int mask, int bar)
> +{
> + return mask & (1 << bar);
BIT() ?
> +}
But I believe this function is not needed (see below).
..
> /**
> - * pcim_iomap_regions - Request and iomap PCI BARs
> + * pcim_iomap_regions - Request and iomap PCI BARs (DEPRECATED)
> * @pdev: PCI device to map IO resources for
> * @mask: Mask of BARs to request and iomap
> * @name: Name associated with the requests
> *
> + * Returns 0 on success, negative error code on failure.
Validate the kernel-doc.
> * Request and iomap regions specified by @mask.
> + *
> + * This function is DEPRECATED. Don't use it in new code.
> + * Use pcim_iomap_region() instead.
> */
..
> + for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
> + if (!mask_contains_bar(mask, bar))
> + continue;
NIH for_each_set_bit().
..
> + ret = pcim_add_mapping_to_legacy_table(pdev, mapping, bar);
> + if (ret != 0)
if (ret)
> + goto err;
> + }
..
> + err:
Better to name it like
err_unmap_and_remove:
> + while (--bar >= 0) {
while (bar--)
is easier to read.
> + pcim_iounmap_region(pdev, bar);
> + pcim_remove_bar_from_legacy_table(pdev, bar);
> + }
..
> +/**
> + * pcim_request_all_regions - Request all regions
> + * @pdev: PCI device to map IO resources for
> + * @name: name associated with the request
> + *
> + * Requested regions will automatically be released at driver detach. If desired,
> + * release individual regions with pcim_release_region() or all of them at once
> + * with pcim_release_all_regions().
> + */
Validate kernel-doc.
..
> + ret = pcim_request_region(pdev, bar, name);
> + if (ret != 0)
if (ret)
> + goto err;
..
> + short bar;
Why signed?
> + int ret = -1;
Oy vei!
..
> + ret = pcim_request_all_regions(pdev, name);
> + if (ret != 0)
Here and in plenty other places
if (ret)
> + return ret;
> + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> + if (!mask_contains_bar(mask, bar))
> + continue;
NIH for_each_set_bit()
> + if (!pcim_iomap(pdev, bar, 0))
> + goto err;
> + }
..
> + if (!legacy_iomap_table)
What's wrong with positive check?
> + ret = -ENOMEM;
> + else
> + ret = -EINVAL;
Can be even one liner
What's wrong with positive check?
ret = legacy_iomap_table ? -EINVAL : -ENOMEM;
..
> + while (--bar >= 0)
while (bar--)
> + pcim_iounmap(pdev, legacy_iomap_table[bar]);
..
> + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> + if (!mask_contains_bar(mask, bar))
> continue;
NIH for_each_set_bit()
> - pcim_iounmap(pdev, iomap[i]);
> - pci_release_region(pdev, i);
> + pcim_iounmap_region(pdev, bar);
> + pcim_remove_bar_from_legacy_table(pdev, bar);
> }
--
With Best Regards,
Andy Shevchenko
Mon, Jan 15, 2024 at 03:46:17PM +0100, Philipp Stanner kirjoitti:
> The bit describing whether the PCI device is currently pinned is stored
> in the PCI devres struct. To clean up and simplify the pci-devres API,
"PCI devres", 'pci-devres', ...
Shouldn't these (and across entire series) be consistent terms?
E.g., "PCI devres API".
> it's better if this information is stored in the pci_dev struct, because
pci_dev struct --> struct pci_dev
> it allows for checking that device's pinned-status directly through the
> device struct.
> This will later permit simplifying pcim_enable_device().
> Move the 'pinned' boolean bit to struct pci_dev.
..
> u8 pm_cap; /* PM capability offset */
> unsigned int enabled:1; /* Whether this dev is enabled */
> + unsigned int pinned:1; /* Whether this dev is pinned */
> unsigned int imm_ready:1; /* Supports Immediate Readiness */
> unsigned int pme_support:5; /* Bitmask of states from which PME#
> can be generated */
First of all, I think it's better to group PM stuff, like
u8 pm_cap; /* PM capability offset */
unsigned int pme_support:5; /* Bitmask of states from which PME#
can be generated */
unsigned int imm_ready:1; /* Supports Immediate Readiness */
unsigned int enabled:1; /* Whether this dev is enabled */
unsigned int pinned:1; /* Whether this dev is pinned */
Second, does this layout anyhow related to the HW layout? (For example,
PME bits and their location in some HW register vs. these bitfields)
If so, but not sure, it might be good to preserve (to some extent) the
order.
--
With Best Regards,
Andy Shevchenko
Mon, Jan 15, 2024 at 03:46:19PM +0100, Philipp Stanner kirjoitti:
> pci_intx() is one of the functions that have "hybrid mode" (i.e.,
> sometimes managed, sometimes not). Providing a separate pcim_intx()
> function with its own device resource and cleanup callback allows for
> removing further large parts of the legacy pci-devres implementation.
>
> As in the region-request-functions, pci_intx() has to call into its
> managed counterpart for backwards compatibility.
>
> Implement pcim_intx() with its own device resource.
> Make pci_intx() call pcim_intx() in the managed case.
> Remove the legacy devres struct from pci.h.
..
> + /*
> + * This is done for backwards compatibility, because the old pci-devres
> + * API had a mode in which this function became managed if the dev had
> + * been enabled with pcim_enable_device() instead of pci_enable_device().
> + */
> + if (pci_is_managed(pdev)) {
> + if (pcim_intx(pdev, enable) != 0)
> + WARN_ON_ONCE(1);
WARN_ON_ONCE(pcim_intx(pdev, enable));
?
> + return;
> + }
..
> + if (new != pci_command)
if (new == pci_command)
return;
?
> pci_write_config_word(pdev, PCI_COMMAND, new);
--
With Best Regards,
Andy Shevchenko
Mon, Jan 15, 2024 at 03:46:20PM +0100, Philipp Stanner kirjoitti:
> Thanks to preceding cleanup steps, pcim_release() is now not needed
> anymore and can be replaced by pcim_disable_device(), which is the exact
> counterpart to pcim_enable_device().
> This permits removing further parts of the old devres API.
>
> Replace pcim_release() with pcim_disable_device().
> Remove the now surplus get_dr() function.
..
> + devm_add_action(&pdev->dev, pcim_disable_device, pdev);
No error check?
> + return pci_enable_device(pdev);
Maybe
ret = pci_enable_device(...);
if (ret)
return ret;
return devm_add_action_or_reset(...)?
I could think of side effects of this, so perhaps the commit message and/or
code needs a comment on why the above proposal can _not_ be used?
--
With Best Regards,
Andy Shevchenko
Mon, Jan 15, 2024 at 03:46:12PM +0100, Philipp Stanner kirjoitti:
> PCI's devres API is not extensible to ranged mappings and has
> bug-provoking features. Improve that by providing better alternatives.
>
> When the original devres API for PCI was implemented, priority was given
> to the creation of a set of "pural functions" such as
> pcim_request_regions(). These functions have bit masks as parameters to
> specify which BARs shall get mapped. Most users, however, only use those
> to mapp 1-3 BARs.
> A complete set of "singular functions" does not exist.
>
> As functions mapping / requesting multiple BARs at once have (almost) no
> mechanism in C to return the resources to the caller of the plural
> function, the devres API utilizes the iomap-table administrated by the
> function pcim_iomap_table().
>
> The entire PCI devres implementation was strongly tied to that table
> which only allows for mapping whole, complete BARs, as the BAR's index
> is used as table index. Consequently, it's not possible to, e.g., have a
> pcim_iomap_range() function with that mechanism.
>
> An additional problem is that pci-devres has been ipmlemented in a sort
> of "hybrid-mode": Some unmanaged functions have managed counterparts
> (e.g.: pci_iomap() <-> pcim_iomap()), making their managed nature
> obvious to the programmer. However, the region-request functions in
> pci.c, prefixed with pci_, behave either managed or unmanaged, depending
> on whether pci_enable_device() or pcim_enable_device() has been called
> in advance.
>
> This hybrid API is confusing and should be more cleanly separated by
> providing always-managed functions prefixed with pcim_.
>
> Thus, the existing devres API is not desirable because:
> a) The vast majority of the users of the plural functions only
> ever sets a single bit in the bit mask, consequently making
> them singular functions anyways.
> b) There is no mechanism to request / iomap only part of a BAR.
> c) The iomap-table mechanism is over-engineered, complicated and
> can by definition not perform bounds checks, thus, provoking
> memory faults: pcim_iomap_table(pdev)[42]
> d) region-request functions being sometimes managed and
> sometimes not is bug-provoking.
>
> Implement a set of singular pcim_ functions that use devres directly
> and bypass the legacy iomap table mechanism.
> Add devres.c to driver-api documentation.
..
> +struct pcim_addr_devres {
> + enum pcim_addr_devres_type type;
> + void __iomem *baseaddr;
> + unsigned long offset;
> + unsigned long len;
> + short bar;
> +};
> +
> +static inline void pcim_addr_devres_clear(struct pcim_addr_devres *res)
> +{
> + res->type = PCIM_ADDR_DEVRES_TYPE_INVALID;
> + res->bar = -1;
> + res->baseaddr = NULL;
> + res->offset = 0;
> + res->len = 0;
More robust (in case the data type gets extended) is memset() + individual
(non-0) sets.
> +}
..
> +static int __pcim_request_region_range(struct pci_dev *pdev, int bar,
> + unsigned long offset, unsigned long maxlen,
> + const char *name, int exclusive)
> +{
> + resource_size_t start = pci_resource_start(pdev, bar);
> + resource_size_t len = pci_resource_len(pdev, bar);
> + unsigned long flags = pci_resource_flags(pdev, bar);
> +
> + if (start == 0 || len == 0) /* that's an unused BAR. */
> + return 0;
> + if (len <= offset)
> + return -EINVAL;
> +
> + start += offset;
> + len -= offset;
> + if (len > maxlen && maxlen != 0)
> + len = maxlen;
if (maxlen && ...)
?
> + if (flags & IORESOURCE_IO) {
> + if (!request_region(start, len, name))
> + return -EBUSY;
> + } else if (flags & IORESOURCE_MEM) {
> + if (!__request_mem_region(start, len, name, exclusive))
> + return -EBUSY;
> + } else {
> + /* That's not a device we can request anything on. */
> + return -ENODEV;
> + }
Hmm... Not sure, but the switch-case against type might be considered:
switch (resource_type(...)) {
...
}
> + return 0;
> +}
> +static void __pcim_release_region_range(struct pci_dev *pdev, int bar,
> + unsigned long offset, unsigned long maxlen)
> +{
> + resource_size_t start = pci_resource_start(pdev, bar);
> + resource_size_t len = pci_resource_len(pdev, bar);
> + unsigned long flags = pci_resource_flags(pdev, bar);
> +
> + if (len <= offset || start == 0)
> + return;
> +
> + if (len == 0 || maxlen == 0) /* This an unused BAR. Do nothing. */
> + return;
> +
> + start += offset;
> + len -= offset;
> +
> + if (len > maxlen)
> + len = maxlen;
This part is quite a duplication of the above function, no?
> + if (flags & IORESOURCE_IO)
> + release_region(start, len);
> + else if (flags & IORESOURCE_MEM)
> + release_mem_region(start, len);
> +}
..
> +static int __pcim_request_region(struct pci_dev *pdev, int bar,
> + const char *name, int exclusive)
> +{
> + const unsigned long offset = 0;
> + const unsigned long len = pci_resource_len(pdev, bar);
How const anyhow useful here?
Ditto for other places like this.
> + return __pcim_request_region_range(pdev, bar, offset, len, name, exclusive);
> +}
..
> +static int pcim_addr_resources_match(struct device *dev, void *a_raw, void *b_raw)
> +{
> + struct pcim_addr_devres *a, *b;
> +
> + a = a_raw;
> + b = b_raw;
> + (void)dev; /* unused. */
Why do we need this?
> + if (a->type != b->type)
> + return 0;
> +
> + switch (a->type) {
> + case PCIM_ADDR_DEVRES_TYPE_REGION:
> + case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> + return a->bar == b->bar;
> + case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> + return a->baseaddr == b->baseaddr;
> + case PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING:
> + return a->bar == b->bar &&
> + a->offset == b->offset && a->len == b->len;
Indentation or made it a single line.
> + default:
> + break;
> + }
> +
> + return 0;
return directly from default case.
> +}
..
> +/**
> + * pcim_iomap_region - Request and iomap a PCI BAR
> + * @pdev: PCI device to map IO resources for
> + * @bar: Index of a BAR to map
> + * @name: Name associated with the request
> + *
> + * Returns __iomem pointer on success, an IOMEM_ERR_PTR on failure.
Please, make sure the kernel-doc won't complain
scripts/kernel-doc -v -none -Wall ...
> + * Mapping and region will get automatically released on driver detach. If
> + * desired, release manually only with pcim_iounmap_region().
> + */
> +void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar, const char *name)
> +{
> + int ret = 0;
Redundant assignment.
> + struct pcim_addr_devres *res;
Perhaps reversed xmas tree order?
> + res = pcim_addr_devres_alloc(pdev);
> + if (!res)
> + return IOMEM_ERR_PTR(-ENOMEM);
> +
> + res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> + res->bar = bar;
> +
> + ret = __pcim_request_region(pdev, bar, name, 0);
> + if (ret != 0)
> + goto err_region;
> +
> + res->baseaddr = pci_iomap(pdev, bar, 0);
> + if (!res->baseaddr) {
> + ret = -EINVAL;
> + goto err_iomap;
> + }
> +
> + devres_add(&pdev->dev, res);
> + return res->baseaddr;
> +
> +err_iomap:
> + __pcim_release_region(pdev, bar);
> +err_region:
> + pcim_addr_devres_free(res);
> +
> + return IOMEM_ERR_PTR(ret);
> +}
..
> +static int _pcim_request_region(struct pci_dev *pdev, int bar, const char *name,
> + int request_flags)
Indentation?
> +{
> + int ret = 0;
Unneded assignment. Also fix this in other places.
> + struct pcim_addr_devres *res;
> +
> + res = pcim_addr_devres_alloc(pdev);
> + if (!res)
> + return -ENOMEM;
> + res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
> + res->bar = bar;
> +
> + ret = __pcim_request_region(pdev, bar, name, request_flags);
> + if (ret != 0) {
if (ret)
Also fix this in other places.
> + pcim_addr_devres_free(res);
> + return ret;
> + }
> +
> + devres_add(&pdev->dev, res);
> + return 0;
> +}
--
With Best Regards,
Andy Shevchenko
On Tue, 2024-01-16 at 12:44 -0600, Bjorn Helgaas wrote:
> On Mon, Jan 15, 2024 at 03:46:12PM +0100, Philipp Stanner wrote:
> > PCI's devres API is not extensible to ranged mappings and has
> > bug-provoking features. Improve that by providing better
> > alternatives.
>
> I guess "ranged mappings" means a mapping that doesn't cover an
> entire
> BAR? Maybe there's a way to clarify?
That's what it's supposed to mean, yes.
We could give it the longer title "mappings smaller than the whole BAR"
or something, I guess.
>
> > When the original devres API for PCI was implemented, priority was
> > given
> > to the creation of a set of "pural functions" such as
> > pcim_request_regions(). These functions have bit masks as
> > parameters to
> > specify which BARs shall get mapped. Most users, however, only use
> > those
> > to mapp 1-3 BARs.
> > A complete set of "singular functions" does not exist.
>
> s/mapp/map/
>
> Rewrap to fill 75 columns or add blank lines between paragraphs.
> Also
> below.
>
> > As functions mapping / requesting multiple BARs at once have
> > (almost) no
> > mechanism in C to return the resources to the caller of the plural
> > function, the devres API utilizes the iomap-table administrated by
> > the
> > function pcim_iomap_table().
> >
> > The entire PCI devres implementation was strongly tied to that
> > table
> > which only allows for mapping whole, complete BARs, as the BAR's
> > index
> > is used as table index. Consequently, it's not possible to, e.g.,
> > have a
> > pcim_iomap_range() function with that mechanism.
> >
> > An additional problem is that pci-devres has been ipmlemented in a
> > sort
> > of "hybrid-mode": Some unmanaged functions have managed
> > counterparts
> > (e.g.: pci_iomap() <-> pcim_iomap()), making their managed nature
> > obvious to the programmer. However, the region-request functions in
> > pci.c, prefixed with pci_, behave either managed or unmanaged,
> > depending
> > on whether pci_enable_device() or pcim_enable_device() has been
> > called
> > in advance.
>
> s/ipmlemented/implemented/
>
> > This hybrid API is confusing and should be more cleanly separated
> > by
> > providing always-managed functions prefixed with pcim_.
> >
> > Thus, the existing devres API is not desirable because:
> > a) The vast majority of the users of the plural functions
> > only
> > ever sets a single bit in the bit mask, consequently
> > making
> > them singular functions anyways.
> > b) There is no mechanism to request / iomap only part of a
> > BAR.
> > c) The iomap-table mechanism is over-engineered,
> > complicated and
> > can by definition not perform bounds checks, thus,
> > provoking
> > memory faults: pcim_iomap_table(pdev)[42]
>
> Not sure what "pcim_iomap_table(pdev)[42]" means.
That function currently is implemented with this prototype:
void __iomem * const *pcim_iomap_table(struct pci_dev *pdev);
And apparently, it's intended to index directly over the function. And
that's how at least part of the users use it indeed.
Here in drivers/crypto/inside-secure/safexcel.c, L.1919 for example:
priv->base = pcim_iomap_table(pdev)[0];
I've never seen something that wonderful in C ever before, so it's not
surprising that you weren't sure what I mean....
pcim_iomap_table() can not and does not perform any bounds check. If
you do
void __iomem *mappy_map_mapface = pcim_iomap_table(pdev)[42];
then it will just return random garbage, or it faults. No -EINVAL or
anything. You won't even get NULL.
That's why this function must die.
>
> > d) region-request functions being sometimes managed and
> > sometimes not is bug-provoking.
>
> Indent with spaces (not tabs) so it still looks good when "git log"
> adds spaces to indent.
>
> > + * Legacy struct storing addresses to whole mapped bars.
>
> s/bar/BAR/ (several places).
>
> > + /* A region spaning an entire bar. */
> > + PCIM_ADDR_DEVRES_TYPE_REGION,
> > +
> > + /* A region spaning an entire bar, and a mapping for that
> > whole bar. */
> > + PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING,
> > +
> > + /*
> > + * A mapping within a bar, either spaning the whole bar or
> > just a range.
> > + * Without a requested region.
>
> s/spaning/spanning/ (several places).
>
> > + if (start == 0 || len == 0) /* that's an unused BAR. */
>
> s/that/That/
>
> > + /*
> > + * Ranged mappings don't get added to the legacy-table,
> > since the table
> > + * only ever keeps track of whole BARs.
> > + */
> > +
>
> Spurious blank line.
I'll take care of the grammar and spelling stuff in v2.
Thanks,
P.
>
> > + devres_add(&pdev->dev, res);
> > + return mapping;
> > +}
> > +EXPORT_SYMBOL(pcim_iomap_range);
>
> Bjorn
>
On Tue, 2024-01-16 at 23:34 +0200, [email protected] wrote:
> Mon, Jan 15, 2024 at 03:46:17PM +0100, Philipp Stanner kirjoitti:
> > The bit describing whether the PCI device is currently pinned is
> > stored
> > in the PCI devres struct. To clean up and simplify the pci-devres
> > API,
>
> "PCI devres", 'pci-devres', ...
> Shouldn't these (and across entire series) be consistent terms?
> E.g., "PCI devres API".
Yes, we should agree on a canonical term probably.
Probably "PCI devres" is good.
>
> > it's better if this information is stored in the pci_dev struct,
> > because
>
> pci_dev struct --> struct pci_dev
>
> > it allows for checking that device's pinned-status directly through
> > the
> > device struct.
> > This will later permit simplifying pcim_enable_device().
>
> > Move the 'pinned' boolean bit to struct pci_dev.
>
> ...
>
> > u8 pm_cap; /* PM capability offset */
> > unsigned int enabled:1; /* Whether this dev is
> > enabled */
> > + unsigned int pinned:1; /* Whether this dev is
> > pinned */
> > unsigned int imm_ready:1; /* Supports Immediate
> > Readiness */
> > unsigned int pme_support:5; /* Bitmask of states from
> > which PME#
> > can be generated */
>
> First of all, I think it's better to group PM stuff, like
ACK
>
> u8 pm_cap; /* PM capability offset */
> unsigned int pme_support:5; /* Bitmask of states from
> which PME#
> can be generated */
> unsigned int imm_ready:1; /* Supports Immediate
> Readiness */
> unsigned int enabled:1; /* Whether this dev is
> enabled */
> unsigned int pinned:1; /* Whether this dev is pinned
> */
>
> Second, does this layout anyhow related to the HW layout? (For
> example,
> PME bits and their location in some HW register vs. these bitfields)
> If so, but not sure, it might be good to preserve (to some extent)
> the
> order.
As far as I know, one of the greatest weaknesses of C is that neither
Ritchie nor the standard ever said anything about the fields' order.
Hence, every field-order is purely implementation-defined and in no way
portable.
So I can't imagine a bitfield will ever directly mapp to a HW-layout?
The fields order is purely aesthetic / for the human reader.
P.
>
On Tue, 2024-01-16 at 23:15 +0200, [email protected] wrote:
> Mon, Jan 15, 2024 at 03:46:12PM +0100, Philipp Stanner kirjoitti:
> > PCI's devres API is not extensible to ranged mappings and has
> > bug-provoking features. Improve that by providing better
> > alternatives.
> >
> > When the original devres API for PCI was implemented, priority was
> > given
> > to the creation of a set of "pural functions" such as
> > pcim_request_regions(). These functions have bit masks as
> > parameters to
> > specify which BARs shall get mapped. Most users, however, only use
> > those
> > to mapp 1-3 BARs.
> > A complete set of "singular functions" does not exist.
> >
> > As functions mapping / requesting multiple BARs at once have
> > (almost) no
> > mechanism in C to return the resources to the caller of the plural
> > function, the devres API utilizes the iomap-table administrated by
> > the
> > function pcim_iomap_table().
> >
> > The entire PCI devres implementation was strongly tied to that
> > table
> > which only allows for mapping whole, complete BARs, as the BAR's
> > index
> > is used as table index. Consequently, it's not possible to, e.g.,
> > have a
> > pcim_iomap_range() function with that mechanism.
> >
> > An additional problem is that pci-devres has been ipmlemented in a
> > sort
> > of "hybrid-mode": Some unmanaged functions have managed
> > counterparts
> > (e.g.: pci_iomap() <-> pcim_iomap()), making their managed nature
> > obvious to the programmer. However, the region-request functions in
> > pci.c, prefixed with pci_, behave either managed or unmanaged,
> > depending
> > on whether pci_enable_device() or pcim_enable_device() has been
> > called
> > in advance.
> >
> > This hybrid API is confusing and should be more cleanly separated
> > by
> > providing always-managed functions prefixed with pcim_.
> >
> > Thus, the existing devres API is not desirable because:
> > a) The vast majority of the users of the plural functions
> > only
> > ever sets a single bit in the bit mask, consequently
> > making
> > them singular functions anyways.
> > b) There is no mechanism to request / iomap only part of a
> > BAR.
> > c) The iomap-table mechanism is over-engineered,
> > complicated and
> > can by definition not perform bounds checks, thus,
> > provoking
> > memory faults: pcim_iomap_table(pdev)[42]
> > d) region-request functions being sometimes managed and
> > sometimes not is bug-provoking.
> >
> > Implement a set of singular pcim_ functions that use devres
> > directly
> > and bypass the legacy iomap table mechanism.
> > Add devres.c to driver-api documentation.
>
> ...
>
> > +struct pcim_addr_devres {
> > + enum pcim_addr_devres_type type;
> > + void __iomem *baseaddr;
> > + unsigned long offset;
> > + unsigned long len;
> > + short bar;
> > +};
> > +
> > +static inline void pcim_addr_devres_clear(struct pcim_addr_devres
> > *res)
> > +{
> > + res->type = PCIM_ADDR_DEVRES_TYPE_INVALID;
> > + res->bar = -1;
> > + res->baseaddr = NULL;
> > + res->offset = 0;
> > + res->len = 0;
>
> More robust (in case the data type gets extended) is memset() +
> individual
> (non-0) sets.
ACK
>
> > +}
>
> ...
>
> > +static int __pcim_request_region_range(struct pci_dev *pdev, int
> > bar,
> > + unsigned long offset, unsigned long maxlen,
> > + const char *name, int exclusive)
> > +{
> > + resource_size_t start = pci_resource_start(pdev, bar);
> > + resource_size_t len = pci_resource_len(pdev, bar);
> > + unsigned long flags = pci_resource_flags(pdev, bar);
> > +
> > + if (start == 0 || len == 0) /* that's an unused BAR. */
> > + return 0;
> > + if (len <= offset)
> > + return -EINVAL;
> > +
> > + start += offset;
> > + len -= offset;
>
> > + if (len > maxlen && maxlen != 0)
> > + len = maxlen;
>
> if (maxlen && ...)
>
> ?
I very much dislike this style, although I'm aware it's used in many
(but not all) regions of the kernel.
It makes your style inconsistent, because sometimes you do indeed check
for something larger or smaller than 0.
Plus, by checking for a number, everyone immediately sees that this is
an integer, not a pointer, which improves readability at 0 cost.
>
> > + if (flags & IORESOURCE_IO) {
> > + if (!request_region(start, len, name))
> > + return -EBUSY;
> > + } else if (flags & IORESOURCE_MEM) {
> > + if (!__request_mem_region(start, len, name,
> > exclusive))
> > + return -EBUSY;
> > + } else {
> > + /* That's not a device we can request anything on.
> > */
> > + return -ENODEV;
> > + }
>
> Hmm... Not sure, but the switch-case against type might be
> considered:
>
> switch (resource_type(...)) {
> ...
> }
You mean resource_type() from ioport.h? How would that be useful here?
Would you want to write a similar function?
I'd say that switch (res->type) 's meaning is very obvious
>
> > + return 0;
> > +}
>
> > +static void __pcim_release_region_range(struct pci_dev *pdev, int
> > bar,
> > + unsigned long offset, unsigned long maxlen)
> > +{
> > + resource_size_t start = pci_resource_start(pdev, bar);
> > + resource_size_t len = pci_resource_len(pdev, bar);
> > + unsigned long flags = pci_resource_flags(pdev, bar);
> > +
> > + if (len <= offset || start == 0)
> > + return;
> > +
> > + if (len == 0 || maxlen == 0) /* This an unused BAR. Do
> > nothing. */
> > + return;
> > +
> > + start += offset;
> > + len -= offset;
> > +
> > + if (len > maxlen)
> > + len = maxlen;
>
> This part is quite a duplication of the above function, no?
Yes.
I once had a wrapper for that in mind, but such a wrapper also gets
quite complicated quickly. Reason being that you don't just check, you
also modify the parameters.
You'd have sth like
int __pcim_check_adjust_region_range_params(unsigned long *start, unsigned long *len);
and then you'd have to return either -EINVAL or 0 and *check* for those
again in the calling function.
That's why I removed the wrapper again and just copied the code,
because I thought that's cheaper, ultimately.
>
> > + if (flags & IORESOURCE_IO)
> > + release_region(start, len);
> > + else if (flags & IORESOURCE_MEM)
> > + release_mem_region(start, len);
> > +}
>
> ...
>
> > +static int __pcim_request_region(struct pci_dev *pdev, int bar,
> > + const char *name, int exclusive)
> > +{
> > + const unsigned long offset = 0;
> > + const unsigned long len = pci_resource_len(pdev, bar);
>
> How const anyhow useful here?
> Ditto for other places like this.
Yeah, we can omit those
>
> > + return __pcim_request_region_range(pdev, bar, offset, len,
> > name, exclusive);
> > +}
>
> ...
>
> > +static int pcim_addr_resources_match(struct device *dev, void
> > *a_raw, void *b_raw)
> > +{
> > + struct pcim_addr_devres *a, *b;
> > +
> > + a = a_raw;
> > + b = b_raw;
>
> > + (void)dev; /* unused. */
>
> Why do we need this?
Old instinct from another project where the compiler punched you for
unused variables and function parameters.
Can remove it.
>
> > + if (a->type != b->type)
> > + return 0;
> > +
> > + switch (a->type) {
> > + case PCIM_ADDR_DEVRES_TYPE_REGION:
> > + case PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING:
> > + return a->bar == b->bar;
> > + case PCIM_ADDR_DEVRES_TYPE_MAPPING:
> > + return a->baseaddr == b->baseaddr;
> > + case PCIM_ADDR_DEVRES_TYPE_REGION_RANGE_MAPPING:
> > + return a->bar == b->bar &&
> > + a->offset == b->offset && a->len == b->len;
>
> Indentation or made it a single line.
How do you want such an indentation to be performed. Tabs mixed with
spaces?
>
> > + default:
> > + break;
> > + }
> > +
> > + return 0;
>
> return directly from default case.
>
> > +}
>
> ...
>
> > +/**
> > + * pcim_iomap_region - Request and iomap a PCI BAR
> > + * @pdev: PCI device to map IO resources for
> > + * @bar: Index of a BAR to map
> > + * @name: Name associated with the request
> > + *
> > + * Returns __iomem pointer on success, an IOMEM_ERR_PTR on
> > failure.
>
> Please, make sure the kernel-doc won't complain
>
> scripts/kernel-doc -v -none -Wall ...
I'll have a look
>
> > + * Mapping and region will get automatically released on driver
> > detach. If
> > + * desired, release manually only with pcim_iounmap_region().
> > + */
> > +void __iomem *pcim_iomap_region(struct pci_dev *pdev, int bar,
> > const char *name)
> > +{
> > + int ret = 0;
>
> Redundant assignment.
I guess we can remove it, but do you think it's not just useless, but
actually bad?
After all, people like the Rust folks frequently complain about the
'problem' in C of variables not being initialized.
I'm neutral about this, we can keep or remove it.
>
> > + struct pcim_addr_devres *res;
>
> Perhaps reversed xmas tree order?
What do you mean? The struct's name? The function's structure?
>
> > + res = pcim_addr_devres_alloc(pdev);
> > + if (!res)
> > + return IOMEM_ERR_PTR(-ENOMEM);
> > +
> > + res->type = PCIM_ADDR_DEVRES_TYPE_REGION_MAPPING;
> > + res->bar = bar;
> > +
> > + ret = __pcim_request_region(pdev, bar, name, 0);
> > + if (ret != 0)
> > + goto err_region;
> > +
> > + res->baseaddr = pci_iomap(pdev, bar, 0);
> > + if (!res->baseaddr) {
> > + ret = -EINVAL;
> > + goto err_iomap;
> > + }
> > +
> > + devres_add(&pdev->dev, res);
> > + return res->baseaddr;
> > +
> > +err_iomap:
> > + __pcim_release_region(pdev, bar);
> > +err_region:
> > + pcim_addr_devres_free(res);
> > +
> > + return IOMEM_ERR_PTR(ret);
> > +}
>
> ...
>
> > +static int _pcim_request_region(struct pci_dev *pdev, int bar,
> > const char *name,
> > + int request_flags)
>
> Indentation?
>
> > +{
> > + int ret = 0;
>
> Unneded assignment. Also fix this in other places.
>
> > + struct pcim_addr_devres *res;
> > +
> > + res = pcim_addr_devres_alloc(pdev);
> > + if (!res)
> > + return -ENOMEM;
> > + res->type = PCIM_ADDR_DEVRES_TYPE_REGION;
> > + res->bar = bar;
> > +
> > + ret = __pcim_request_region(pdev, bar, name,
> > request_flags);
> > + if (ret != 0) {
>
> if (ret)
>
> Also fix this in other places.
See above.
Thx for the review,
P.
>
> > + pcim_addr_devres_free(res);
> > + return ret;
> > + }
> > +
> > + devres_add(&pdev->dev, res);
> > + return 0;
> > +}
>
On Tue, 2024-01-16 at 23:27 +0200, [email protected] wrote:
> Mon, Jan 15, 2024 at 03:46:13PM +0100, Philipp Stanner kirjoitti:
> > The old plural devres functions tie PCI's devres implementation to
> > the
> > iomap-table mechanism which unfortunately is not extensible.
> >
> > As the purlal functions are almost never used with more than one
> > bit set
> > in their bit-mask, deprecating them and encouraging users to use
> > the new
> > singular functions instead is reasonable.
> >
> > Furthermore, to make the implementation more consistent and
> > extensible,
> > the plural functions should use the singular functions.
> >
> > Add new wrapper to request / release all BARs.
> > Make the plural functions call into the singular functions.
> > Mark the plural functions as deprecated.
> > Remove as much of the iomap-table-mechanism as possible.
> > Add comments describing the path towards a cleaned-up API.
>
> ...
>
> > static void pcim_iomap_release(struct device *gendev, void *res)
> > {
> > - struct pci_dev *dev = to_pci_dev(gendev);
> > - struct pcim_iomap_devres *this = res;
> > - int i;
> > -
> > - for (i = 0; i < PCIM_IOMAP_MAX; i++)
> > - if (this->table[i])
> > - pci_iounmap(dev, this->table[i]);
> > + /*
> > + * Do nothing. This is legacy code.
> > + *
> > + * Cleanup of the mappings is now done directly through the
> > callbacks
> > + * registered when creating them.
> > + */
> > }
>
> How many users we have? Can't we simply kill it for good?
>
> ...
There are 126 users in the kernel.
When I began with all of this I indeed checked whether we might burn it
completely.
Indeed the majority of driver users are wise enough to do something
like
driver->map0 = pcim_iomap_table(pdev)[0];
driver->map1 = pcim_iomap_table(pdev)[1];
So we could just easily replace it with
+ drivers->map0 = pcim_iomap_region(pdev, 0, "driver");
+ if (IS_ERR(drivers->map0))
Buuuuuuuuuut.. certain people, such as the crypto folks, do indeed not
store the mapping-address in their own driver-struct but call
pcim_iomap_table() to get the address whenever they need it.
That's where I gave up. It's too much work for and would take forever,
even if the code were sane, until you get it all merged.
We want that new API for DRM, that's why I'm working on it in the first
place.
That's why this API should have never been merged back in 2006/7. But
no one dared to touch it for 16 years, so here we are.
>
> > + * This function is DEPRECATED. Do not use it in new code.
>
> We have __deprecated IIRC, can it be used?
It seems the Boss has deprecated __deprecated a few years ago :D
/*
* Don't. Just don't. See commit 771c035372a0 ("deprecate the '__deprecated'
* attribute warnings entirely and for good") for more information.
*
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-deprecated-function-attribute
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-deprecated-type-attribute
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-deprecated-variable-attribute
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Enumerator-Attributes.html#index-deprecated-enumerator-attribute
* clang: https://clang.llvm.org/docs/AttributeReference.html#deprecated
*/
#define __deprecated
>
> ...
>
> > + if (pcim_add_mapping_to_legacy_table(pdev, mapping, bar) !=
> > 0)
>
> Redundant ' != 0' part.
>
> > + goto err_table;
>
> ...
>
> > +static inline bool mask_contains_bar(int mask, int bar)
> > +{
> > + return mask & (1 << bar);
>
> BIT() ?
Yo, we could use it
>
> > +}
>
> But I believe this function is not needed (see below).
>
> ...
>
> > /**
> > - * pcim_iomap_regions - Request and iomap PCI BARs
> > + * pcim_iomap_regions - Request and iomap PCI BARs (DEPRECATED)
> > * @pdev: PCI device to map IO resources for
> > * @mask: Mask of BARs to request and iomap
> > * @name: Name associated with the requests
> > *
> > + * Returns 0 on success, negative error code on failure.
>
> Validate the kernel-doc.
>
> > * Request and iomap regions specified by @mask.
> > + *
> > + * This function is DEPRECATED. Don't use it in new code.
> > + * Use pcim_iomap_region() instead.
> > */
>
> ...
>
> > + for (bar = 0; bar < DEVICE_COUNT_RESOURCE; bar++) {
> > + if (!mask_contains_bar(mask, bar))
> > + continue;
>
> NIH for_each_set_bit().
But wouldn't that iterate too frequently?
It would check all bits, whereas this for-loop (that I inherited from
the existing implementation, I just tried to keep the functionality
identical) loops until DEVICE_COUNT_RESOURCE, which might be !=
sizeof(int).
>
> ...
>
> > + ret = pcim_add_mapping_to_legacy_table(pdev,
> > mapping, bar);
> > + if (ret != 0)
>
> if (ret)
>
> > + goto err;
> > + }
>
> ...
>
> > + err:
>
> Better to name it like
>
> err_unmap_and_remove:
>
> > + while (--bar >= 0) {
>
> while (bar--)
>
> is easier to read.
I won't die on this hill, but do not really agree.
>
> > + pcim_iounmap_region(pdev, bar);
> > + pcim_remove_bar_from_legacy_table(pdev, bar);
> > + }
>
> ...
>
> > +/**
> > + * pcim_request_all_regions - Request all regions
> > + * @pdev: PCI device to map IO resources for
> > + * @name: name associated with the request
> > + *
> > + * Requested regions will automatically be released at driver
> > detach. If desired,
> > + * release individual regions with pcim_release_region() or all of
> > them at once
> > + * with pcim_release_all_regions().
> > + */
>
> Validate kernel-doc.
>
> ...
>
> > + ret = pcim_request_region(pdev, bar, name);
> > + if (ret != 0)
>
> if (ret)
>
> > + goto err;
>
>
> ...
>
> > + short bar;
>
> Why signed?
No reason. The max bar number in PCI is 6.
We can un-sign it
>
> > + int ret = -1;
>
> Oy vei!
>
> ...
>
> > + ret = pcim_request_all_regions(pdev, name);
> > + if (ret != 0)
>
> Here and in plenty other places
>
> if (ret)
>
> > + return ret;
>
> > + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> > + if (!mask_contains_bar(mask, bar))
> > + continue;
>
> NIH for_each_set_bit()
>
> > + if (!pcim_iomap(pdev, bar, 0))
> > + goto err;
> > + }
>
> ...
>
> > + if (!legacy_iomap_table)
>
> What's wrong with positive check?
>
> > + ret = -ENOMEM;
> > + else
> > + ret = -EINVAL;
>
> Can be even one liner
>
>
> What's wrong with positive check?
>
> ret = legacy_iomap_table ? -EINVAL : -ENOMEM;
Yo, can do that
P.
>
> ...
>
> > + while (--bar >= 0)
>
> while (bar--)
>
> > + pcim_iounmap(pdev, legacy_iomap_table[bar]);
>
> ...
>
> > + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) {
> > + if (!mask_contains_bar(mask, bar))
> > continue;
>
> NIH for_each_set_bit()
>
> > - pcim_iounmap(pdev, iomap[i]);
> > - pci_release_region(pdev, i);
> > + pcim_iounmap_region(pdev, bar);
> > + pcim_remove_bar_from_legacy_table(pdev, bar);
> > }
>
On Tue, 2024-01-16 at 23:17 +0200, [email protected] wrote:
> Mon, Jan 15, 2024 at 03:46:11PM +0100, Philipp Stanner kirjoitti:
> > ¡Hola!
>
> i? Vim user? :-)
The Dark Side of the Force is the path to many abilities, that some
consider to be... unnatural
https://www.neo-layout.org/
>
> > PCI's devres API suffers several weaknesses:
> >
> > 1. There are functions prefixed with pcim_. Those are always
> > managed
> > counterparts to never-managed functions prefixed with pci_ – or
> > so one
> > would like to think. There are some apparently unmanaged
> > functions
> > (all region-request / release functions, and pci_intx()) which
> > suddenly become managed once the user has initialized the device
> > with
> > pcim_enable_device() instead of pci_enable_device(). This
> > "sometimes
> > yes, sometimes no" nature of those functions is confusing and
> > therefore bug-provoking. In fact, it has already caused a bug in
> > DRM.
> > The last patch in this series fixes that bug.
> > 2. iomappings: Instead of giving each mapping its own callback, the
> > existing API uses a statically allocated struct tracking one
> > mapping
> > per bar. This is not extensible. Especially, you can't create
> > _ranged_ managed mappings that way, which many drivers want.
> > 3. Managed request functions only exist as "plural versions" with a
> > bit-mask as a parameter. That's quite over-engineered
> > considering
> > that each user only ever mapps one, maybe two bars.
> >
> > This series:
> > - add a set of new "singular" devres functions that use devres the
> > way
> > its intended, with one callback per resource.
> > - deprecates the existing iomap-table mechanism.
> > - deprecates the hybrid nature of pci_ functions.
> > - preserves backwards compatibility so that drivers using the
> > existing
> > API won't notice any changes.
> > - adds documentation, especially some warning users about the
> > complicated nature of PCI's devres.
>
> Instead of adding pcim_intx(), please provide proper one for
> pci_alloc_irq_vectors(). Ideally it would be nice to deprecate
> old IRQ management functions in PCI core and delete them in the
> future.
>
In order to deprecate the intermingling with half-managed hyprid devres
in pci.c, you need to have pci_intx() be backwards compatible. Unless
you can remove it at once.
And the least broken way to do that I thought would be pcim_intx(),
because that's consistent with how I make pci_request_region() & Co.
call into their managed counterparts.
There are 25 users of pci_intx().
We'd have to look how many of them call pcim_enable_device() and how
easy they would be to port to... pci_alloc_irq_vectors() you say? I
haven't used that before. Would have to look into it and see how we
could do that.
P.
On Tue, 2024-01-16 at 23:40 +0200, [email protected] wrote:
> Mon, Jan 15, 2024 at 03:46:20PM +0100, Philipp Stanner kirjoitti:
> > Thanks to preceding cleanup steps, pcim_release() is now not needed
> > anymore and can be replaced by pcim_disable_device(), which is the
> > exact
> > counterpart to pcim_enable_device().
> > This permits removing further parts of the old devres API.
> >
> > Replace pcim_release() with pcim_disable_device().
> > Remove the now surplus get_dr() function.
>
> ...
>
> > + devm_add_action(&pdev->dev, pcim_disable_device, pdev);
>
> No error check?
>
> > + return pci_enable_device(pdev);
>
> Maybe
>
> ret = pci_enable_device(...);
> if (ret)
> return ret;
>
> return devm_add_action_or_reset(...)?
>
> I could think of side effects of this, so perhaps the commit message
> and/or
> code needs a comment on why the above proposal can _not_ be used?
>
That proposal can be used, so this was simply a bug.
P.
On Wed, 2024-01-17 at 10:59 +0100, Philipp Stanner wrote:
> On Tue, 2024-01-16 at 23:17 +0200, [email protected] wrote:
> > Mon, Jan 15, 2024 at 03:46:11PM +0100, Philipp Stanner kirjoitti:
[...]
> >
> >
> > > PCI's devres API suffers several weaknesses:
> > >
> > > 1. There are functions prefixed with pcim_. Those are always
> > > managed
> > > counterparts to never-managed functions prefixed with pci_ –
> > > or
> > > so one
> > > would like to think. There are some apparently unmanaged
> > > functions
> > > (all region-request / release functions, and pci_intx()) which
> > > suddenly become managed once the user has initialized the
> > > device
> > > with
> > > pcim_enable_device() instead of pci_enable_device(). This
> > > "sometimes
> > > yes, sometimes no" nature of those functions is confusing and
> > > therefore bug-provoking. In fact, it has already caused a bug
> > > in
> > > DRM.
> > > The last patch in this series fixes that bug.
> > > 2. iomappings: Instead of giving each mapping its own callback,
> > > the
> > > existing API uses a statically allocated struct tracking one
> > > mapping
> > > per bar. This is not extensible. Especially, you can't create
> > > _ranged_ managed mappings that way, which many drivers want.
> > > 3. Managed request functions only exist as "plural versions" with
> > > a
> > > bit-mask as a parameter. That's quite over-engineered
> > > considering
> > > that each user only ever mapps one, maybe two bars.
> > >
> > > This series:
> > > - add a set of new "singular" devres functions that use devres
> > > the
> > > way
> > > its intended, with one callback per resource.
> > > - deprecates the existing iomap-table mechanism.
> > > - deprecates the hybrid nature of pci_ functions.
> > > - preserves backwards compatibility so that drivers using the
> > > existing
> > > API won't notice any changes.
> > > - adds documentation, especially some warning users about the
> > > complicated nature of PCI's devres.
> >
> > Instead of adding pcim_intx(), please provide proper one for
> > pci_alloc_irq_vectors(). Ideally it would be nice to deprecate
> > old IRQ management functions in PCI core and delete them in the
> > future.
> >
>
> In order to deprecate the intermingling with half-managed hyprid
> devres
> in pci.c, you need to have pci_intx() be backwards compatible. Unless
> you can remove it at once.
> And the least broken way to do that I thought would be pcim_intx(),
> because that's consistent with how I make pci_request_region() & Co.
> call into their managed counterparts.
>
> There are 25 users of pci_intx().
> We'd have to look how many of them call pcim_enable_device() and how
> easy they would be to port to... pci_alloc_irq_vectors() you say? I
> haven't used that before. Would have to look into it and see how we
> could do that.
Alright, so I thought about that a bit.
So pci_intx() is the old way to do it and you would like to deprecate
it for good. Understood, makes sense
This series, however, is about deprecating PCI's broken devres
implementation – not about deprecating outdated PCI features in
general.
So I wouldn't like to touch anything here unless cleaning up devres
demands it.
Now the question would be: how can we solve this?
My suggestion would be:
Let's implement pcim_intx(), but only make it visible through
drivers/pci/pci.h. So we won't make it usable for other drivers, don't
EXPORT_SYMBOL() it and basically only have it as a tool to move the
devres-part clearly and cleanly from pci.c to devres.c
Further deprecating old PCI stuff could then be done in a separate
series.
ACK?
P.
>
>
> P.
On Wed, Jan 17, 2024 at 09:54:47AM +0100, Philipp Stanner wrote:
> On Tue, 2024-01-16 at 12:44 -0600, Bjorn Helgaas wrote:
> > On Mon, Jan 15, 2024 at 03:46:12PM +0100, Philipp Stanner wrote:
> > > PCI's devres API is not extensible to ranged mappings and has
> > > bug-provoking features. Improve that by providing better
> > > alternatives.
> >
> > I guess "ranged mappings" means a mapping that doesn't cover an
> > entire BAR? Maybe there's a way to clarify?
>
> That's what it's supposed to mean, yes. We could give it the longer
> title "mappings smaller than the whole BAR" or something, I guess.
"partial BAR mappings"?
> > > to the creation of a set of "pural functions" such as
s/pural/plural/ (I missed this before).
> > > c) The iomap-table mechanism is over-engineered,
> > > complicated and
> > > can by definition not perform bounds checks, thus,
> > > provoking
> > > memory faults: pcim_iomap_table(pdev)[42]
> >
> > Not sure what "pcim_iomap_table(pdev)[42]" means.
>
> That function currently is implemented with this prototype:
> void __iomem * const *pcim_iomap_table(struct pci_dev *pdev);
>
> And apparently, it's intended to index directly over the function. And
> that's how at least part of the users use it indeed.
>
> Here in drivers/crypto/inside-secure/safexcel.c, L.1919 for example:
>
> priv->base = pcim_iomap_table(pdev)[0];
>
> I've never seen something that wonderful in C ever before, so it's not
> surprising that you weren't sure what I mean....
>
> pcim_iomap_table() can not and does not perform any bounds check. If
> you do
>
> void __iomem *mappy_map_mapface = pcim_iomap_table(pdev)[42];
>
> then it will just return random garbage, or it faults. No -EINVAL or
> anything. You won't even get NULL.
>
> That's why this function must die.
No argument except that this example only makes sense after one looks
up the prototype and connects the dots.
Bjorn