2021-01-08 01:26:36

by Saravana Kannan

[permalink] [raw]
Subject: [PATCH v3] driver core: Fix device link device name collision

The device link device's name was of the form:
<supplier-dev-name>--<consumer-dev-name>

This can cause name collision as reported here [1] as device names are
not globally unique. Since device names have to be unique within the
bus/class, add the bus/class name as a prefix to the device names used to
construct the device link device name.

So the devuce link device's name will be of the form:
<supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>

[1] - https://lore.kernel.org/lkml/[email protected]/

Cc: [email protected]
Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
Reported-by: Michael Walle <[email protected]>
Signed-off-by: Saravana Kannan <[email protected]>
---
Documentation/ABI/testing/sysfs-class-devlink | 4 ++--
.../ABI/testing/sysfs-devices-consumer | 5 +++--
.../ABI/testing/sysfs-devices-supplier | 5 +++--
drivers/base/core.c | 17 +++++++++--------
include/linux/device.h | 12 ++++++++++++
5 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-class-devlink b/Documentation/ABI/testing/sysfs-class-devlink
index b662f747c83e..8a21ce515f61 100644
--- a/Documentation/ABI/testing/sysfs-class-devlink
+++ b/Documentation/ABI/testing/sysfs-class-devlink
@@ -5,8 +5,8 @@ Description:
Provide a place in sysfs for the device link objects in the
kernel at any given time. The name of a device link directory,
denoted as ... above, is of the form <supplier>--<consumer>
- where <supplier> is the supplier device name and <consumer> is
- the consumer device name.
+ where <supplier> is the supplier bus:device name and <consumer>
+ is the consumer bus:device name.

What: /sys/class/devlink/.../auto_remove_on
Date: May 2020
diff --git a/Documentation/ABI/testing/sysfs-devices-consumer b/Documentation/ABI/testing/sysfs-devices-consumer
index 1f06d74d1c3c..0809fda092e6 100644
--- a/Documentation/ABI/testing/sysfs-devices-consumer
+++ b/Documentation/ABI/testing/sysfs-devices-consumer
@@ -4,5 +4,6 @@ Contact: Saravana Kannan <[email protected]>
Description:
The /sys/devices/.../consumer:<consumer> are symlinks to device
links where this device is the supplier. <consumer> denotes the
- name of the consumer in that device link. There can be zero or
- more of these symlinks for a given device.
+ name of the consumer in that device link and is of the form
+ bus:device name. There can be zero or more of these symlinks
+ for a given device.
diff --git a/Documentation/ABI/testing/sysfs-devices-supplier b/Documentation/ABI/testing/sysfs-devices-supplier
index a919e0db5e90..207f5972e98d 100644
--- a/Documentation/ABI/testing/sysfs-devices-supplier
+++ b/Documentation/ABI/testing/sysfs-devices-supplier
@@ -4,5 +4,6 @@ Contact: Saravana Kannan <[email protected]>
Description:
The /sys/devices/.../supplier:<supplier> are symlinks to device
links where this device is the consumer. <supplier> denotes the
- name of the supplier in that device link. There can be zero or
- more of these symlinks for a given device.
+ name of the supplier in that device link and is of the form
+ bus:device name. There can be zero or more of these symlinks
+ for a given device.
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 25e08e5f40bd..4140a69dfe18 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -456,7 +456,9 @@ static int devlink_add_symlinks(struct device *dev,
struct device *con = link->consumer;
char *buf;

- len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
+ len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
+ strlen(dev_bus_name(con)) + strlen(dev_name(con)));
+ len += strlen(":");
len += strlen("supplier:") + 1;
buf = kzalloc(len, GFP_KERNEL);
if (!buf)
@@ -470,12 +472,12 @@ static int devlink_add_symlinks(struct device *dev,
if (ret)
goto err_con;

- snprintf(buf, len, "consumer:%s", dev_name(con));
+ snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con));
ret = sysfs_create_link(&sup->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_con_dev;

- snprintf(buf, len, "supplier:%s", dev_name(sup));
+ snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup));
ret = sysfs_create_link(&con->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_sup_dev;
@@ -737,8 +739,9 @@ struct device_link *device_link_add(struct device *consumer,

link->link_dev.class = &devlink_class;
device_set_pm_not_required(&link->link_dev);
- dev_set_name(&link->link_dev, "%s--%s",
- dev_name(supplier), dev_name(consumer));
+ dev_set_name(&link->link_dev, "%s:%s--%s:%s",
+ dev_bus_name(supplier), dev_name(supplier),
+ dev_bus_name(consumer), dev_name(consumer));
if (device_register(&link->link_dev)) {
put_device(consumer);
put_device(supplier);
@@ -1808,9 +1811,7 @@ const char *dev_driver_string(const struct device *dev)
* never change once they are set, so they don't need special care.
*/
drv = READ_ONCE(dev->driver);
- return drv ? drv->name :
- (dev->bus ? dev->bus->name :
- (dev->class ? dev->class->name : ""));
+ return drv ? drv->name : dev_bus_name(dev);
}
EXPORT_SYMBOL(dev_driver_string);

diff --git a/include/linux/device.h b/include/linux/device.h
index 89bb8b84173e..1779f90eeb4c 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -609,6 +609,18 @@ static inline const char *dev_name(const struct device *dev)
return kobject_name(&dev->kobj);
}

+/**
+ * dev_bus_name - Return a device's bus/class name, if at all possible
+ * @dev: struct device to get the bus/class name of
+ *
+ * Will return the name of the bus/class the device is attached to. If it is
+ * not attached to a bus/class, an empty string will be returned.
+ */
+static inline const char *dev_bus_name(const struct device *dev)
+{
+ return dev->bus ? dev->bus->name : (dev->class ? dev->class->name : "");
+}
+
__printf(2, 3) int dev_set_name(struct device *dev, const char *name, ...);

#ifdef CONFIG_NUMA
--
2.29.2.729.g45daf8777d-goog


2021-01-08 08:19:54

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Fix device link device name collision

Am 2021-01-08 02:24, schrieb Saravana Kannan:
> The device link device's name was of the form:
> <supplier-dev-name>--<consumer-dev-name>
>
> This can cause name collision as reported here [1] as device names are
> not globally unique. Since device names have to be unique within the
> bus/class, add the bus/class name as a prefix to the device names used
> to
> construct the device link device name.
>
> So the devuce link device's name will be of the form:
> <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
>
> [1] -
> https://lore.kernel.org/lkml/[email protected]/
>
> Cc: [email protected]
> Fixes: 287905e68dd2 ("driver core: Expose device link details in
> sysfs")
> Reported-by: Michael Walle <[email protected]>
> Signed-off-by: Saravana Kannan <[email protected]>
> ---
[..]

The changes are missing for the error path and
devlink_remove_symlinks(),
right?

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4140a69dfe18..385e16d92874 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device *dev,
goto out;

err_sup_dev:
- snprintf(buf, len, "consumer:%s", dev_name(con));
+ snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
dev_name(con));
sysfs_remove_link(&sup->kobj, buf);
err_con_dev:
sysfs_remove_link(&link->link_dev.kobj, "consumer");
@@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
*dev,
sysfs_remove_link(&link->link_dev.kobj, "consumer");
sysfs_remove_link(&link->link_dev.kobj, "supplier");

- len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
+ len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
+ strlen(dev_bus_name(con)) + strlen(dev_name(con)));
+ len += strlen(":");
len += strlen("supplier:") + 1;
buf = kzalloc(len, GFP_KERNEL);
if (!buf) {
@@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
*dev,
return;
}

- snprintf(buf, len, "supplier:%s", dev_name(sup));
+ snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
dev_name(sup));
sysfs_remove_link(&con->kobj, buf);
- snprintf(buf, len, "consumer:%s", dev_name(con));
+ snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
dev_name(con));
sysfs_remove_link(&sup->kobj, buf);
kfree(buf);
}

With these changes:

Tested-by: Michael Walle <[email protected]>

This at least make the warning go away.
BUT, there is somesthing strange or at least I don't get it:

# find /sys/bus/pci/devices/0000:00:00.0/ -name "consumer\:*"
# find /sys/bus/pci/devices/0000:00:00.1/ -name "consumer\:*"
/sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1:04
/sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1

enetc0 (0000:00:00.0) has no consumers while enetc1 (0000:00:00.1)
has ones. Although both have PHYs connected. Here are the
corresonding device tree entries:

enetc0:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts?h=v5.11-rc2#n81

enetc1:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts?h=v5.11-rc2#n21

Why is there a link between enetc1 and its PHY (and MDIO bus)
but not for enetc0?

btw, here are all links:

# ls /sys/class/devlink/
pci:0000:00:00.1--mdio_bus:0000:00:00.1
platform:5000000.iommu--pci:0000:00:00.0
platform:5000000.iommu--pci:0000:00:00.1
platform:5000000.iommu--pci:0000:00:00.2
platform:5000000.iommu--pci:0000:00:00.3
platform:5000000.iommu--pci:0000:00:00.5
platform:5000000.iommu--pci:0000:00:00.6
platform:5000000.iommu--pci:0001:00:00.0
platform:5000000.iommu--pci:0002:00:00.0
platform:5000000.iommu--platform:2140000.mmc
platform:5000000.iommu--platform:2150000.mmc
platform:5000000.iommu--platform:22c0000.dma-controller
platform:5000000.iommu--platform:3100000.usb
platform:5000000.iommu--platform:3110000.usb
platform:5000000.iommu--platform:3200000.sata
platform:5000000.iommu--platform:8000000.crypto
platform:5000000.iommu--platform:8380000.dma-controller
platform:5000000.iommu--platform:f080000.display
platform:f1f0000.clock-controller--platform:f080000.display
regulator:regulator.0--i2c:0-0050
regulator:regulator.0--i2c:1-0057
regulator:regulator.0--i2c:2-0050
regulator:regulator.0--platform:3200000.sata

--
-michael

2021-01-08 17:26:17

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Fix device link device name collision

On Fri, Jan 8, 2021 at 12:16 AM Michael Walle <[email protected]> wrote:
>
> Am 2021-01-08 02:24, schrieb Saravana Kannan:
> > The device link device's name was of the form:
> > <supplier-dev-name>--<consumer-dev-name>
> >
> > This can cause name collision as reported here [1] as device names are
> > not globally unique. Since device names have to be unique within the
> > bus/class, add the bus/class name as a prefix to the device names used
> > to
> > construct the device link device name.
> >
> > So the devuce link device's name will be of the form:
> > <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
> >
> > [1] -
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > Cc: [email protected]
> > Fixes: 287905e68dd2 ("driver core: Expose device link details in
> > sysfs")
> > Reported-by: Michael Walle <[email protected]>
> > Signed-off-by: Saravana Kannan <[email protected]>
> > ---
> [..]
>
> The changes are missing for the error path and
> devlink_remove_symlinks(),
> right?

Removing symlinks doesn't need the name. Just needs the "handle". So
we are good there.

>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 4140a69dfe18..385e16d92874 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device *dev,
> goto out;
>
> err_sup_dev:
> - snprintf(buf, len, "consumer:%s", dev_name(con));
> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
> dev_name(con));
> sysfs_remove_link(&sup->kobj, buf);
> err_con_dev:
> sysfs_remove_link(&link->link_dev.kobj, "consumer");
> @@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
> *dev,
> sysfs_remove_link(&link->link_dev.kobj, "consumer");
> sysfs_remove_link(&link->link_dev.kobj, "supplier");
>
> - len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
> + len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
> + strlen(dev_bus_name(con)) + strlen(dev_name(con)));
> + len += strlen(":");
> len += strlen("supplier:") + 1;
> buf = kzalloc(len, GFP_KERNEL);
> if (!buf) {
> @@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
> *dev,
> return;
> }
>
> - snprintf(buf, len, "supplier:%s", dev_name(sup));
> + snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
> dev_name(sup));
> sysfs_remove_link(&con->kobj, buf);
> - snprintf(buf, len, "consumer:%s", dev_name(con));
> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
> dev_name(con));
> sysfs_remove_link(&sup->kobj, buf);
> kfree(buf);
> }
>
> With these changes:
>
> Tested-by: Michael Walle <[email protected]>

Greg,

I think it's good to pick up this version if you don't see any issues.

>
> This at least make the warning go away.

Phew!

> BUT, there is somesthing strange or at least I don't get it:
>
> # find /sys/bus/pci/devices/0000:00:00.0/ -name "consumer\:*"
> # find /sys/bus/pci/devices/0000:00:00.1/ -name "consumer\:*"
> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1:04
> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1
>
> enetc0 (0000:00:00.0) has no consumers while enetc1 (0000:00:00.1)
> has ones. Although both have PHYs connected. Here are the
> corresonding device tree entries:
>
> enetc0:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts?h=v5.11-rc2#n81
>
> enetc1:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts?h=v5.11-rc2#n21
>
> Why is there a link between enetc1 and its PHY (and MDIO bus)
> but not for enetc0?

So a lot of subtle things could be going on here that make it look
like it's not working correctly but it's actually working fine. Since
fw_devlink=permissive is the default mode, all links that are created
are SYNC_STATE_ONLY links. These links are deleted after their
consumers probe. So if you really want to see all the "real" links
persist, try booting with fw_devlink=on. You might have boot issues
though -- I'm working on that separately [1]. Also, SYNC_STATE_ONLY
links can be created between the parent of a consumer and the supplier
(I won't get into the why here) depending on some ordering -- so that
might be causing some spurious looking links, but they aren't.

Another way to do what you are trying to do is to enable the logs in
device_link_add() and look at them to see if all the links are created
as you'd expect.

> btw, here are all links:
>
> # ls /sys/class/devlink/
> pci:0000:00:00.1--mdio_bus:0000:00:00.1
> platform:5000000.iommu--pci:0000:00:00.0
> platform:5000000.iommu--pci:0000:00:00.1
> platform:5000000.iommu--pci:0000:00:00.2
> platform:5000000.iommu--pci:0000:00:00.3
> platform:5000000.iommu--pci:0000:00:00.5
> platform:5000000.iommu--pci:0000:00:00.6
> platform:5000000.iommu--pci:0001:00:00.0
> platform:5000000.iommu--pci:0002:00:00.0
> platform:5000000.iommu--platform:2140000.mmc
> platform:5000000.iommu--platform:2150000.mmc
> platform:5000000.iommu--platform:22c0000.dma-controller
> platform:5000000.iommu--platform:3100000.usb
> platform:5000000.iommu--platform:3110000.usb
> platform:5000000.iommu--platform:3200000.sata
> platform:5000000.iommu--platform:8000000.crypto
> platform:5000000.iommu--platform:8380000.dma-controller
> platform:5000000.iommu--platform:f080000.display
> platform:f1f0000.clock-controller--platform:f080000.display
> regulator:regulator.0--i2c:0-0050
> regulator:regulator.0--i2c:1-0057
> regulator:regulator.0--i2c:2-0050
> regulator:regulator.0--platform:3200000.sata

As you can see, most of the links that fw_devlink created are gone.
Because all the consumers probed. Any remaining ones you see here are
non-SYNC_STATE_ONLY links created by the driver/frameworks or cases
where consumers haven't probed. My guess is that only the first one is
of this criteria and it doesn't hurt anything here. Try booting with
fw_devlink=on and check this list. That'll give you a better idea.

-Saravana

2021-01-09 16:53:04

by Michael Walle

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Fix device link device name collision

Am 2021-01-08 18:22, schrieb Saravana Kannan:
> On Fri, Jan 8, 2021 at 12:16 AM Michael Walle <[email protected]> wrote:
>>
>> Am 2021-01-08 02:24, schrieb Saravana Kannan:
>> > The device link device's name was of the form:
>> > <supplier-dev-name>--<consumer-dev-name>
>> >
>> > This can cause name collision as reported here [1] as device names are
>> > not globally unique. Since device names have to be unique within the
>> > bus/class, add the bus/class name as a prefix to the device names used
>> > to
>> > construct the device link device name.
>> >
>> > So the devuce link device's name will be of the form:
>> > <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
>> >
>> > [1] -
>> > https://lore.kernel.org/lkml/[email protected]/
>> >
>> > Cc: [email protected]
>> > Fixes: 287905e68dd2 ("driver core: Expose device link details in
>> > sysfs")
>> > Reported-by: Michael Walle <[email protected]>
>> > Signed-off-by: Saravana Kannan <[email protected]>
>> > ---
>> [..]
>>
>> The changes are missing for the error path and
>> devlink_remove_symlinks(),
>> right?
>
> Removing symlinks doesn't need the name. Just needs the "handle". So
> we are good there.

I don't get it. What is the "handle"? Without the patch below
kernfs_remove_by_name() in sysfs_remove_link will return -ENOENT. With
the patch it will return 0.

And even if it would work, how is this even logical:

snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con));
ret = sysfs_create_link(&sup->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_con_dev;
snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup));
ret = sysfs_create_link(&con->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_sup_dev;
[..]
err_sup_dev:
snprintf(buf, len, "consumer:%s", dev_name(con));
sysfs_remove_link(&sup->kobj, buf);

You call sysfs_create_link("consumer:bus_name:dev_name") but the
corresponding rollback is sysfs_remove_link("consumer:dev_name"), that
is super confusing.

>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 4140a69dfe18..385e16d92874 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device
>> *dev,
>> goto out;
>>
>> err_sup_dev:
>> - snprintf(buf, len, "consumer:%s", dev_name(con));
>> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
>> dev_name(con));
>> sysfs_remove_link(&sup->kobj, buf);
>> err_con_dev:
>> sysfs_remove_link(&link->link_dev.kobj, "consumer");
>> @@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>> sysfs_remove_link(&link->link_dev.kobj, "consumer");
>> sysfs_remove_link(&link->link_dev.kobj, "supplier");
>>
>> - len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
>> + len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
>> + strlen(dev_bus_name(con)) + strlen(dev_name(con)));
>> + len += strlen(":");
>> len += strlen("supplier:") + 1;
>> buf = kzalloc(len, GFP_KERNEL);
>> if (!buf) {
>> @@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>> return;
>> }
>>
>> - snprintf(buf, len, "supplier:%s", dev_name(sup));
>> + snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
>> dev_name(sup));
>> sysfs_remove_link(&con->kobj, buf);
>> - snprintf(buf, len, "consumer:%s", dev_name(con));
>> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
>> dev_name(con));

btw this should be dev_bus_name(con).

>> sysfs_remove_link(&sup->kobj, buf);
>> kfree(buf);
>> }
>>
>> With these changes:
>>
>> Tested-by: Michael Walle <[email protected]>
>
> Greg,
>
> I think it's good to pick up this version if you don't see any issues.

Why so fast?


>> This at least make the warning go away.
>
> Phew!
>
>> BUT, there is somesthing strange or at least I don't get it:
>>
>> # find /sys/bus/pci/devices/0000:00:00.0/ -name "consumer\:*"
>> # find /sys/bus/pci/devices/0000:00:00.1/ -name "consumer\:*"
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1:04
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1
>>
>> enetc0 (0000:00:00.0) has no consumers while enetc1 (0000:00:00.1)
>> has ones. Although both have PHYs connected. Here are the
>> corresonding device tree entries:
>>
>> enetc0:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts?h=v5.11-rc2#n81
>>
>> enetc1:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts?h=v5.11-rc2#n21
>>
>> Why is there a link between enetc1 and its PHY (and MDIO bus)
>> but not for enetc0?
>
> So a lot of subtle things could be going on here that make it look
> like it's not working correctly but it's actually working fine. Since
> fw_devlink=permissive is the default mode, all links that are created
> are SYNC_STATE_ONLY links. These links are deleted after their
> consumers probe. So if you really want to see all the "real" links
> persist, try booting with fw_devlink=on. You might have boot issues
> though -- I'm working on that separately [1]. Also, SYNC_STATE_ONLY
> links can be created between the parent of a consumer and the supplier
> (I won't get into the why here) depending on some ordering -- so that
> might be causing some spurious looking links, but they aren't.
>
> Another way to do what you are trying to do is to enable the logs in
> device_link_add() and look at them to see if all the links are created
> as you'd expect.
>
>> btw, here are all links:
>>
>> # ls /sys/class/devlink/
>> pci:0000:00:00.1--mdio_bus:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.0
>> platform:5000000.iommu--pci:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.2
>> platform:5000000.iommu--pci:0000:00:00.3
>> platform:5000000.iommu--pci:0000:00:00.5
>> platform:5000000.iommu--pci:0000:00:00.6
>> platform:5000000.iommu--pci:0001:00:00.0
>> platform:5000000.iommu--pci:0002:00:00.0
>> platform:5000000.iommu--platform:2140000.mmc
>> platform:5000000.iommu--platform:2150000.mmc
>> platform:5000000.iommu--platform:22c0000.dma-controller
>> platform:5000000.iommu--platform:3100000.usb
>> platform:5000000.iommu--platform:3110000.usb
>> platform:5000000.iommu--platform:3200000.sata
>> platform:5000000.iommu--platform:8000000.crypto
>> platform:5000000.iommu--platform:8380000.dma-controller
>> platform:5000000.iommu--platform:f080000.display
>> platform:f1f0000.clock-controller--platform:f080000.display
>> regulator:regulator.0--i2c:0-0050
>> regulator:regulator.0--i2c:1-0057
>> regulator:regulator.0--i2c:2-0050
>> regulator:regulator.0--platform:3200000.sata
>
> As you can see, most of the links that fw_devlink created are gone.
> Because all the consumers probed. Any remaining ones you see here are
> non-SYNC_STATE_ONLY links created by the driver/frameworks or cases
> where consumers haven't probed. My guess is that only the first one is
> of this criteria and it doesn't hurt anything here. Try booting with
> fw_devlink=on and check this list. That'll give you a better idea.

Thanks for explaining. I'll try that.

-michael

2021-01-09 17:12:11

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Fix device link device name collision

On Sat, Jan 9, 2021 at 8:49 AM Michael Walle <[email protected]> wrote:
>
> Am 2021-01-08 18:22, schrieb Saravana Kannan:
> > On Fri, Jan 8, 2021 at 12:16 AM Michael Walle <[email protected]> wrote:
> >>
> >> Am 2021-01-08 02:24, schrieb Saravana Kannan:
> >> > The device link device's name was of the form:
> >> > <supplier-dev-name>--<consumer-dev-name>
> >> >
> >> > This can cause name collision as reported here [1] as device names are
> >> > not globally unique. Since device names have to be unique within the
> >> > bus/class, add the bus/class name as a prefix to the device names used
> >> > to
> >> > construct the device link device name.
> >> >
> >> > So the devuce link device's name will be of the form:
> >> > <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
> >> >
> >> > [1] -
> >> > https://lore.kernel.org/lkml/[email protected]/
> >> >
> >> > Cc: [email protected]
> >> > Fixes: 287905e68dd2 ("driver core: Expose device link details in
> >> > sysfs")
> >> > Reported-by: Michael Walle <[email protected]>
> >> > Signed-off-by: Saravana Kannan <[email protected]>
> >> > ---
> >> [..]
> >>
> >> The changes are missing for the error path and
> >> devlink_remove_symlinks(),
> >> right?
> >
> > Removing symlinks doesn't need the name. Just needs the "handle". So
> > we are good there.
>
> I don't get it. What is the "handle"? Without the patch below
> kernfs_remove_by_name() in sysfs_remove_link will return -ENOENT. With
> the patch it will return 0.
>
> And even if it would work, how is this even logical:

Ah sorry, I confused it with removing device attrs. I need to fix up
the symlink remove path.

>
> snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con));
> ret = sysfs_create_link(&sup->kobj, &link->link_dev.kobj, buf);
> if (ret)
> goto err_con_dev;
> snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup));
> ret = sysfs_create_link(&con->kobj, &link->link_dev.kobj, buf);
> if (ret)
> goto err_sup_dev;
> [..]
> err_sup_dev:
> snprintf(buf, len, "consumer:%s", dev_name(con));
> sysfs_remove_link(&sup->kobj, buf);
>
> You call sysfs_create_link("consumer:bus_name:dev_name") but the
> corresponding rollback is sysfs_remove_link("consumer:dev_name"), that
> is super confusing.
>
> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
> >> index 4140a69dfe18..385e16d92874 100644
> >> --- a/drivers/base/core.c
> >> +++ b/drivers/base/core.c
> >> @@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device
> >> *dev,
> >> goto out;
> >>
> >> err_sup_dev:
> >> - snprintf(buf, len, "consumer:%s", dev_name(con));
> >> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
> >> dev_name(con));
> >> sysfs_remove_link(&sup->kobj, buf);
> >> err_con_dev:
> >> sysfs_remove_link(&link->link_dev.kobj, "consumer");
> >> @@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
> >> *dev,
> >> sysfs_remove_link(&link->link_dev.kobj, "consumer");
> >> sysfs_remove_link(&link->link_dev.kobj, "supplier");
> >>
> >> - len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
> >> + len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
> >> + strlen(dev_bus_name(con)) + strlen(dev_name(con)));
> >> + len += strlen(":");
> >> len += strlen("supplier:") + 1;
> >> buf = kzalloc(len, GFP_KERNEL);
> >> if (!buf) {
> >> @@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
> >> *dev,
> >> return;
> >> }
> >>
> >> - snprintf(buf, len, "supplier:%s", dev_name(sup));
> >> + snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
> >> dev_name(sup));
> >> sysfs_remove_link(&con->kobj, buf);
> >> - snprintf(buf, len, "consumer:%s", dev_name(con));
> >> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
> >> dev_name(con));

Ah I completely skimmed over this code thinking it was code from my
patch. Like I said, I was struggling with the length of the email due
to the logs. Anyway, I'll fix up the remove symlink path too. Thanks
for catching that.

> btw this should be dev_bus_name(con).
>
> >> sysfs_remove_link(&sup->kobj, buf);
> >> kfree(buf);
> >> }
> >>
> >> With these changes:
> >>
> >> Tested-by: Michael Walle <[email protected]>
> >
> > Greg,
> >
> > I think it's good to pick up this version if you don't see any issues.
>
> Why so fast?

Sorry, didn't mean to rush. I was just trying to say I wasn't planning
on a v4 because I thought your Tested-by was for my unchanged v4, but
clearly I need to send a v4.

-Saravana