2022-04-21 09:48:27

by Krzysztof Kozlowski

Subject: [PATCH v7 00/12] Fix broken usage of driver_override (and kfree of static memory)

Hi,

This is a continuation of my old patchset from 2019. [1]
Back then, a few drivers set driver_override incorrectly. I fixed Exynos
in a different way after discussions. QCOM NGD was not fixed,
and a new user appeared: IMX SCU.

It seems "char *" in driver_override looks too consty, so we
tend to make the mistake of storing string literals there.
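
To illustrate the bug class described above, here is a minimal userspace
sketch (not kernel code). The names toy_device, toy_set_override() and
toy_override_demo() are hypothetical, and malloc()/free() stand in for the
kernel allocators. It shows why a field that its owner later frees must never
point at a string literal, and how routing every update through one helper
keeps that invariant.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a bus device; not a kernel structure. */
struct toy_device {
	const char *driver_override;	/* invariant: NULL or heap memory */
};

/*
 * Replace the override with a heap copy of @name; NULL clears it.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int toy_set_override(struct toy_device *dev, const char *name)
{
	char *copy = NULL;

	if (name) {
		copy = malloc(strlen(name) + 1);
		if (!copy)
			return -1;
		strcpy(copy, name);
	}

	/* Safe only because the field never points at a string literal. */
	free((void *)dev->driver_override);
	dev->driver_override = copy;
	return 0;
}

/* Small self-check exercising set, replace and clear. */
void toy_override_demo(void)
{
	struct toy_device dev = { NULL };

	/*
	 * The buggy pattern would be: dev.driver_override = "imx-scu-clk";
	 * a later free() of that pointer is undefined behaviour.
	 */
	assert(toy_set_override(&dev, "imx-scu-clk") == 0);
	assert(strcmp(dev.driver_override, "imx-scu-clk") == 0);

	assert(toy_set_override(&dev, "other-driver") == 0);
	assert(strcmp(dev.driver_override, "other-driver") == 0);

	assert(toy_set_override(&dev, NULL) == 0);
	assert(dev.driver_override == NULL);
}
```

Making the field "const char *" in the sketch mirrors the direction taken in
this series: the owner may free the string, but nobody is expected to modify
it in place.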

Changes since latest v7
=======================
1. Patch #1: remove out_free label, document clearing override in kerneldoc and
in code-comments (Andy).
2. Patch #12 (rpmsg): do not duplicate string (Biju).

Changes since latest v6
=======================
1. Patch #1: Don't check for !dev and handle len==0 (Andy).
2. New patch #11 (rpmsg): split constifying of local variable to a new patch.

Changes since latest v5
=======================
1. Patch #11 (rpmsg): split from previous patch 11 - easier to understand the
need of it.
2. Fix build issue in patch 12 (rpmsg).

Changes since latest v4
=======================
1. Correct commit msgs and comments after Andy's review.
2. Re-order code in new helper (patch #1) (Andy).
3. Add tags.

Changes since latest v3
=======================
1. Wrap comments, extend comment in driver_set_override() about newline.
2. Minor commit msg fixes.
3. Add tags.

Changes since latest v2
=======================
1. Make all driver_override fields as "const char *", just like SPI
and VDPA. (Mark)
2. Move "count" check to the new helper and add "count" argument. (Michael)
3. Fix typos in docs, patch subject. Extend doc. (Michael, Bjorn)
4. Compare pointers to reduce number of string readings in the helper.
5. Fix clk-imx return value.

Changes since latest v1 (not the old 2019 solution):
====================================================
https://lore.kernel.org/all/[email protected]/
1. Add helper for setting driver_override.
2. Use the helper.

Dependencies (and stable):
==========================
1. All patches, including last three fixes, depend on the first patch
introducing the helper.
2. The last three commits - fixes - are probably not backportable
directly, because of this dependency. I don't know how to express
this dependency here, since stable-kernel-rules.rst mentions only commits as
possible dependencies.

[1] https://lore.kernel.org/all/[email protected]/

Best regards,
Krzysztof

Krzysztof Kozlowski (12):
  driver: platform: Add helper for safer setting of driver_override
  amba: Use driver_set_override() instead of open-coding
  fsl-mc: Use driver_set_override() instead of open-coding
  hv: Use driver_set_override() instead of open-coding
  PCI: Use driver_set_override() instead of open-coding
  s390/cio: Use driver_set_override() instead of open-coding
  spi: Use helper for safer setting of driver_override
  vdpa: Use helper for safer setting of driver_override
  clk: imx: scu: Fix kfree() of static memory on setting driver_override
  slimbus: qcom-ngd: Fix kfree() of static memory on setting
    driver_override
  rpmsg: Constify local variable in field store macro
  rpmsg: Fix kfree() of static memory on setting driver_override

drivers/amba/bus.c | 28 ++-----------
drivers/base/driver.c | 69 +++++++++++++++++++++++++++++++++
drivers/base/platform.c | 28 ++-----------
drivers/bus/fsl-mc/fsl-mc-bus.c | 25 ++----------
drivers/clk/imx/clk-scu.c | 7 +++-
drivers/hv/vmbus_drv.c | 28 ++-----------
drivers/pci/pci-sysfs.c | 28 ++-----------
drivers/rpmsg/rpmsg_core.c | 3 +-
drivers/rpmsg/rpmsg_internal.h | 13 ++++++-
drivers/rpmsg/rpmsg_ns.c | 14 ++++++-
drivers/s390/cio/cio.h | 6 ++-
drivers/s390/cio/css.c | 28 ++-----------
drivers/slimbus/qcom-ngd-ctrl.c | 13 ++++++-
drivers/spi/spi.c | 26 ++-----------
drivers/vdpa/vdpa.c | 29 ++------------
include/linux/amba/bus.h | 6 ++-
include/linux/device/driver.h | 2 +
include/linux/fsl/mc.h | 6 ++-
include/linux/hyperv.h | 6 ++-
include/linux/pci.h | 6 ++-
include/linux/platform_device.h | 6 ++-
include/linux/rpmsg.h | 6 ++-
include/linux/spi/spi.h | 2 +
include/linux/vdpa.h | 4 +-
24 files changed, 184 insertions(+), 205 deletions(-)

--
2.32.0


2022-04-21 14:28:12

by Krzysztof Kozlowski

Subject: [PATCH v7 04/12] hv: Use driver_set_override() instead of open-coding

Use a helper to set driver_override to reduce the amount of duplicated
code. Make the driver_override field "const char *", because it is not
modified by the core and it matches other subsystems.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
---
drivers/hv/vmbus_drv.c | 28 ++++------------------------
include/linux/hyperv.h | 6 +++++-
2 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 14de17087864..607e40aba18e 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -575,31 +575,11 @@ static ssize_t driver_override_store(struct device *dev,
 				     const char *buf, size_t count)
 {
 	struct hv_device *hv_dev = device_to_hv_device(dev);
-	char *driver_override, *old, *cp;
-
-	/* We need to keep extra room for a newline */
-	if (count >= (PAGE_SIZE - 1))
-		return -EINVAL;
-
-	driver_override = kstrndup(buf, count, GFP_KERNEL);
-	if (!driver_override)
-		return -ENOMEM;
-
-	cp = strchr(driver_override, '\n');
-	if (cp)
-		*cp = '\0';
-
-	device_lock(dev);
-	old = hv_dev->driver_override;
-	if (strlen(driver_override)) {
-		hv_dev->driver_override = driver_override;
-	} else {
-		kfree(driver_override);
-		hv_dev->driver_override = NULL;
-	}
-	device_unlock(dev);
+	int ret;

-	kfree(old);
+	ret = driver_set_override(dev, &hv_dev->driver_override, buf, count);
+	if (ret)
+		return ret;

 	return count;
 }
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index fe2e0179ed51..12e2336b23b7 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1257,7 +1257,11 @@ struct hv_device {
 	u16 device_id;

 	struct device device;
-	char *driver_override; /* Driver name to force a match */
+	/*
+	 * Driver name to force a match. Do not set directly, because core
+	 * frees it. Use driver_set_override() to set or clear it.
+	 */
+	const char *driver_override;

 	struct vmbus_channel *channel;
 	struct kset *channels_kset;
--
2.32.0
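
The behaviour of the open-coded store function removed above (length check,
newline stripping, empty input clearing the override) can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not the code
from patch #1: model_set_override(), model_override_demo() and MODEL_PAGE_SIZE
are hypothetical names, malloc()/free() replace kstrndup()/kfree(), and the
device_lock()/device_unlock() pair around the pointer swap is only marked by a
comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Store a copy of the first @len bytes of @s in *@override, or clear it. */
static int model_set_override(const char **override, const char *s, size_t len)
{
	const char *nl;
	const char *old;
	char *new = NULL;

	if (!override || !s)
		return -EINVAL;

	/* Leave room for the newline added when the value is shown again. */
	if (len >= MODEL_PAGE_SIZE - 1)
		return -EINVAL;

	/* Keep only the text before the first newline, if there is one. */
	nl = memchr(s, '\n', len);
	if (nl)
		len = (size_t)(nl - s);

	if (len) {
		new = malloc(len + 1);
		if (!new)
			return -ENOMEM;
		memcpy(new, s, len);
		new[len] = '\0';
	}

	/* In the kernel this swap would happen under device_lock(). */
	old = *override;
	*override = new;	/* NULL when the input was empty or just "\n" */

	free((void *)old);
	return 0;
}

/* Small self-check: set with trailing newline, then clear twice. */
void model_override_demo(void)
{
	static const char input[] = "my-driver\n";
	const char *override = NULL;

	assert(model_set_override(&override, input, sizeof(input) - 1) == 0);
	assert(strcmp(override, "my-driver") == 0);

	assert(model_set_override(&override, "\n", 1) == 0);
	assert(override == NULL);

	assert(model_set_override(&override, "", 0) == 0);
	assert(override == NULL);
}
```

With the logic in one place, a bus-specific store callback only has to forward
its arguments and translate a zero return into "count", as the hunk above
does.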

2022-04-22 07:16:18

by Krzysztof Kozlowski

Subject: Re: [PATCH v7 00/12] Fix broken usage of driver_override (and kfree of static memory)

On 19/04/2022 13:34, Krzysztof Kozlowski wrote:

Hi Greg, Rafael,

The patchset has been on the lists for some time and got some reviews.
I hope I applied or responded to all the requested changes and feedback.

The entire set depends on the driver core changes, so maybe you could pick
up everything via the driver core tree?

> Dependencies (and stable):
> ==========================
> 1. All patches, including last three fixes, depend on the first patch
> introducing the helper.
> 2. The last three commits - fixes - are probably not backportable
> directly, because of this dependency. I don't know how to express
> this dependency here, since stable-kernel-rules.rst mentions only commits as
> possible dependencies.


Best regards,
Krzysztof

2022-04-22 07:33:45

by Krzysztof Kozlowski

Subject: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

The driver_override field from platform driver should not be initialized
from static memory (string literal) because the core later kfree()s it,
for example when driver_override is set via sysfs.

Use the dedicated helper to set driver_override properly.

Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
---
drivers/rpmsg/rpmsg_internal.h | 13 +++++++++++--
drivers/rpmsg/rpmsg_ns.c | 14 ++++++++++++--
include/linux/rpmsg.h | 6 ++++--
3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/rpmsg/rpmsg_internal.h b/drivers/rpmsg/rpmsg_internal.h
index d4b23fd019a8..3e81642238d2 100644
--- a/drivers/rpmsg/rpmsg_internal.h
+++ b/drivers/rpmsg/rpmsg_internal.h
@@ -94,10 +94,19 @@ int rpmsg_release_channel(struct rpmsg_device *rpdev,
  */
 static inline int rpmsg_ctrldev_register_device(struct rpmsg_device *rpdev)
 {
+	int ret;
+
 	strcpy(rpdev->id.name, "rpmsg_ctrl");
-	rpdev->driver_override = "rpmsg_ctrl";
+	ret = driver_set_override(&rpdev->dev, &rpdev->driver_override,
+				  rpdev->id.name, strlen(rpdev->id.name));
+	if (ret)
+		return ret;
+
+	ret = rpmsg_register_device(rpdev);
+	if (ret)
+		kfree(rpdev->driver_override);

-	return rpmsg_register_device(rpdev);
+	return ret;
 }

 #endif
diff --git a/drivers/rpmsg/rpmsg_ns.c b/drivers/rpmsg/rpmsg_ns.c
index 762ff1ae279f..8eb8f328237e 100644
--- a/drivers/rpmsg/rpmsg_ns.c
+++ b/drivers/rpmsg/rpmsg_ns.c
@@ -20,12 +20,22 @@
  */
 int rpmsg_ns_register_device(struct rpmsg_device *rpdev)
 {
+	int ret;
+
 	strcpy(rpdev->id.name, "rpmsg_ns");
-	rpdev->driver_override = "rpmsg_ns";
+	ret = driver_set_override(&rpdev->dev, &rpdev->driver_override,
+				  rpdev->id.name, strlen(rpdev->id.name));
+	if (ret)
+		return ret;
+
 	rpdev->src = RPMSG_NS_ADDR;
 	rpdev->dst = RPMSG_NS_ADDR;

-	return rpmsg_register_device(rpdev);
+	ret = rpmsg_register_device(rpdev);
+	if (ret)
+		kfree(rpdev->driver_override);
+
+	return ret;
 }
 EXPORT_SYMBOL(rpmsg_ns_register_device);

diff --git a/include/linux/rpmsg.h b/include/linux/rpmsg.h
index 02fa9116cd60..20c8cd1cde21 100644
--- a/include/linux/rpmsg.h
+++ b/include/linux/rpmsg.h
@@ -41,7 +41,9 @@ struct rpmsg_channel_info {
  * rpmsg_device - device that belong to the rpmsg bus
  * @dev: the device struct
  * @id: device id (used to match between rpmsg drivers and devices)
- * @driver_override: driver name to force a match
+ * @driver_override: driver name to force a match; do not set directly,
+ *                   because core frees it; use driver_set_override() to
+ *                   set or clear it.
  * @src: local address
  * @dst: destination address
  * @ept: the rpmsg endpoint of this channel
@@ -51,7 +53,7 @@ struct rpmsg_channel_info {
 struct rpmsg_device {
 	struct device dev;
 	struct rpmsg_device_id id;
-	char *driver_override;
+	const char *driver_override;
 	u32 src;
 	u32 dst;
 	struct rpmsg_endpoint *ept;
--
2.32.0

2022-04-22 18:14:03

by Krzysztof Kozlowski

Subject: [PATCH v7 11/12] rpmsg: Constify local variable in field store macro

The memory pointed to by the variable 'old' in the field store macro is not
modified, so it can be made a pointer to const.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
---
drivers/rpmsg/rpmsg_core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/rpmsg/rpmsg_core.c b/drivers/rpmsg/rpmsg_core.c
index 79368a957d89..95fc283f6af7 100644
--- a/drivers/rpmsg/rpmsg_core.c
+++ b/drivers/rpmsg/rpmsg_core.c
@@ -400,7 +400,8 @@ field##_store(struct device *dev, struct device_attribute *attr, \
 	      const char *buf, size_t sz) \
 { \
 	struct rpmsg_device *rpdev = to_rpmsg_device(dev); \
-	char *new, *old; \
+	const char *old; \
+	char *new; \
 	\
 	new = kstrndup(buf, sz, GFP_KERNEL); \
 	if (!new) \
--
2.32.0

2022-04-22 20:48:38

by Krzysztof Kozlowski

Subject: [PATCH v7 07/12] spi: Use helper for safer setting of driver_override

Use a helper to set driver_override to reduce the amount of duplicated
code.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
---
drivers/spi/spi.c | 26 ++++----------------------
include/linux/spi/spi.h | 2 ++
2 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 890ff46c784a..be8f1a1e21b2 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -71,29 +71,11 @@ static ssize_t driver_override_store(struct device *dev,
 				     const char *buf, size_t count)
 {
 	struct spi_device *spi = to_spi_device(dev);
-	const char *end = memchr(buf, '\n', count);
-	const size_t len = end ? end - buf : count;
-	const char *driver_override, *old;
-
-	/* We need to keep extra room for a newline when displaying value */
-	if (len >= (PAGE_SIZE - 1))
-		return -EINVAL;
-
-	driver_override = kstrndup(buf, len, GFP_KERNEL);
-	if (!driver_override)
-		return -ENOMEM;
+	int ret;

-	device_lock(dev);
-	old = spi->driver_override;
-	if (len) {
-		spi->driver_override = driver_override;
-	} else {
-		/* Empty string, disable driver override */
-		spi->driver_override = NULL;
-		kfree(driver_override);
-	}
-	device_unlock(dev);
-	kfree(old);
+	ret = driver_set_override(dev, &spi->driver_override, buf, count);
+	if (ret)
+		return ret;

 	return count;
 }
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 5f8c063ddff4..f0177f9b6e13 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -138,6 +138,8 @@ extern int spi_delay_exec(struct spi_delay *_delay, struct spi_transfer *xfer);
  *	for driver coldplugging, and in uevents used for hotplugging
  * @driver_override: If the name of a driver is written to this attribute, then
  *	the device will bind to the named driver and only the named driver.
+ *	Do not set directly, because core frees it; use driver_set_override() to
+ *	set or clear it.
  * @cs_gpiod: gpio descriptor of the chipselect line (optional, NULL when
  *	not using a GPIO line)
  * @word_delay: delay to be inserted between consecutive
--
2.32.0

2022-04-22 21:06:49

by Krzysztof Kozlowski

Subject: [PATCH v7 09/12] clk: imx: scu: Fix kfree() of static memory on setting driver_override

The driver_override field from platform driver should not be initialized
from static memory (string literal) because the core later kfree()s it,
for example when driver_override is set via sysfs.

Use the dedicated helper to set driver_override properly.

Fixes: 77d8f3068c63 ("clk: imx: scu: add two cells binding support")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Stephen Boyd <[email protected]>
---
drivers/clk/imx/clk-scu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/imx/clk-scu.c b/drivers/clk/imx/clk-scu.c
index ed3c01d2e8ae..4996f1d94657 100644
--- a/drivers/clk/imx/clk-scu.c
+++ b/drivers/clk/imx/clk-scu.c
@@ -683,7 +683,12 @@ struct clk_hw *imx_clk_scu_alloc_dev(const char *name,
 		return ERR_PTR(ret);
 	}

-	pdev->driver_override = "imx-scu-clk";
+	ret = driver_set_override(&pdev->dev, &pdev->driver_override,
+				  "imx-scu-clk", strlen("imx-scu-clk"));
+	if (ret) {
+		platform_device_put(pdev);
+		return ERR_PTR(ret);
+	}

 	ret = imx_clk_scu_attach_pd(&pdev->dev, rsrc_id);
 	if (ret)
--
2.32.0

2022-04-22 21:58:23

by Greg Kroah-Hartman

Subject: Re: [PATCH v7 00/12] Fix broken usage of driver_override (and kfree of static memory)

On Wed, Apr 20, 2022 at 11:20:06AM +0200, Krzysztof Kozlowski wrote:
> On 19/04/2022 13:34, Krzysztof Kozlowski wrote:
>
> Hi Greg, Rafael,
>
> The patchset was for some time on the lists, got some reviews, some
> changes/feedback which I hope I applied/responded.
>
> Entire set depends on the driver core changes, so maybe you could pick
> up everything via drivers core tree?

Ok, will do, thanks.

greg k-h

2022-04-24 02:44:29

by Abel Vesa

Subject: Re: [PATCH v7 09/12] clk: imx: scu: Fix kfree() of static memory on setting driver_override

On 22-04-19 13:34:32, Krzysztof Kozlowski wrote:
> The driver_override field from platform driver should not be initialized
> from static memory (string literal) because the core later kfree() it,
> for example when driver_override is set via sysfs.
>
> Use dedicated helper to set driver_override properly.
>
> Fixes: 77d8f3068c63 ("clk: imx: scu: add two cells binding support")
> Signed-off-by: Krzysztof Kozlowski <[email protected]>
> Acked-by: Stephen Boyd <[email protected]>

Reviewed-by: Abel Vesa <[email protected]>


2022-04-29 18:49:17

by Marek Szyprowski

Subject: Re: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

Hi Krzysztof,

On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
> The driver_override field from platform driver should not be initialized
> from static memory (string literal) because the core later kfree() it,
> for example when driver_override is set via sysfs.
>
> Use dedicated helper to set driver_override properly.
>
> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
> Signed-off-by: Krzysztof Kozlowski <[email protected]>
> Reviewed-by: Bjorn Andersson <[email protected]>

This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
Fix kfree() of static memory on setting driver_override"). In my tests I
found that it triggers the following issue during boot of the
DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):

------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
__mutex_lock+0x1ec/0x430
Modules linked in:
CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mutex_lock+0x1ec/0x430
lr : __mutex_lock+0x1ec/0x430
..
Call trace:
 __mutex_lock+0x1ec/0x430
 mutex_lock_nested+0x38/0x64
 driver_set_override+0x124/0x150
 qcom_smd_register_edge+0x2a8/0x4ec
 qcom_smd_probe+0x54/0x80
 platform_probe+0x68/0xe0
 really_probe.part.0+0x9c/0x29c
 __driver_probe_device+0x98/0x144
 driver_probe_device+0xac/0x14c
 __device_attach_driver+0xb8/0x120
 bus_for_each_drv+0x78/0xd0
 __device_attach+0xd8/0x180
 device_initial_probe+0x14/0x20
 bus_probe_device+0x9c/0xa4
 deferred_probe_work_func+0x88/0xc4
 process_one_work+0x288/0x6bc
 worker_thread+0x248/0x450
 kthread+0x118/0x11c
 ret_from_fork+0x10/0x20
irq event stamp: 3599
hardirqs last  enabled at (3599): [<ffff80000919053c>]
_raw_spin_unlock_irqrestore+0x98/0x9c
hardirqs last disabled at (3598): [<ffff800009190ba4>]
_raw_spin_lock_irqsave+0xc0/0xcc
softirqs last  enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
softirqs last disabled at (3549): [<ffff8000080a4514>]
__irq_exit_rcu+0x180/0x1ac
---[ end trace 0000000000000000 ]---

I don't see any direct relation between the $subject and the above log,
but reverting the $subject on top of linux next-20220429 hides/fixes it.
Maybe there is a kind of memory trashing somewhere there and your change
only revealed it?

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2022-05-02 23:16:16

by Krzysztof Kozlowski

Subject: Re: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

On 29/04/2022 16:51, Marek Szyprowski wrote:
> On 29.04.2022 16:16, Krzysztof Kozlowski wrote:
>> On 29/04/2022 14:29, Marek Szyprowski wrote:
>>> On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
>>>> The driver_override field from platform driver should not be initialized
>>>> from static memory (string literal) because the core later kfree() it,
>>>> for example when driver_override is set via sysfs.
>>>>
>>>> Use dedicated helper to set driver_override properly.
>>>>
>>>> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
>>>> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
>>>> Signed-off-by: Krzysztof Kozlowski <[email protected]>
>>>> Reviewed-by: Bjorn Andersson <[email protected]>
>>> This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
>>> Fix kfree() of static memory on setting driver_override"). In my tests I
>>> found that it triggers the following issue during boot of the
>>> DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):
>>>
>>> ------------[ cut here ]------------
>>> DEBUG_LOCKS_WARN_ON(lock->magic != lock)
>>> WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
>>> __mutex_lock+0x1ec/0x430
>>> Modules linked in:
>>> CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
>>> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
>>> Workqueue: events_unbound deferred_probe_work_func
>>> pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> pc : __mutex_lock+0x1ec/0x430
>>> lr : __mutex_lock+0x1ec/0x430
>>> ..
>>> Call trace:
>>>  __mutex_lock+0x1ec/0x430
>>>  mutex_lock_nested+0x38/0x64
>>>  driver_set_override+0x124/0x150
>>>  qcom_smd_register_edge+0x2a8/0x4ec
>>>  qcom_smd_probe+0x54/0x80
>>>  platform_probe+0x68/0xe0
>>>  really_probe.part.0+0x9c/0x29c
>>>  __driver_probe_device+0x98/0x144
>>>  driver_probe_device+0xac/0x14c
>>>  __device_attach_driver+0xb8/0x120
>>>  bus_for_each_drv+0x78/0xd0
>>>  __device_attach+0xd8/0x180
>>>  device_initial_probe+0x14/0x20
>>>  bus_probe_device+0x9c/0xa4
>>>  deferred_probe_work_func+0x88/0xc4
>>>  process_one_work+0x288/0x6bc
>>>  worker_thread+0x248/0x450
>>>  kthread+0x118/0x11c
>>>  ret_from_fork+0x10/0x20
>>> irq event stamp: 3599
>>> hardirqs last  enabled at (3599): [<ffff80000919053c>]
>>> _raw_spin_unlock_irqrestore+0x98/0x9c
>>> hardirqs last disabled at (3598): [<ffff800009190ba4>]
>>> _raw_spin_lock_irqsave+0xc0/0xcc
>>> softirqs last  enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
>>> softirqs last disabled at (3549): [<ffff8000080a4514>]
>>> __irq_exit_rcu+0x180/0x1ac
>>> ---[ end trace 0000000000000000 ]---
>>>
>>> I don't see any direct relation between the $subject and the above log,
>>> but reverting the $subject on top of linux next-20220429 hides/fixes it.
>>> Maybe there is a kind of memory trashing somewhere there and your change
>>> only revealed it?
>> Thanks for the report. I think the error path of my patch is wrong - I
>> should not kfree(rpdev->driver_override) from the rpmsg code. That's the
>> only thing I see now...
>>
>> Could you test following patch and tell if it helps?
>> https://pastebin.ubuntu.com/p/rp3q9Z5fXj/
>
> This doesn't help, the issue is still reported.

I think I screwed up this part of the code. The new helper uses
device_lock() (the mutex you see in the backtrace), but in rpmsg it is
called before device_register(), which initializes the device.
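
The ordering problem described above can be illustrated with a toy lock that,
like the check in the warning (lock->magic != lock), records its own address
when it is initialized. This is a userspace sketch only: struct dbg_mutex,
struct toy_rpdev and all the functions below are hypothetical and merely mimic
the idea; they are not the kernel's mutex or rpmsg code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy debug lock: "magic" equals the lock's own address once initialized. */
struct dbg_mutex {
	struct dbg_mutex *magic;
	bool held;
};

/* Hypothetical device with an embedded lock, zeroed at allocation time. */
struct toy_rpdev {
	struct dbg_mutex lock;
	char driver_override[32];
};

static void dbg_mutex_init(struct dbg_mutex *m)
{
	m->magic = m;
	m->held = false;
}

/* Returns false where a debug build would print the warning. */
static bool dbg_mutex_lock(struct dbg_mutex *m)
{
	if (m->magic != m)
		return false;
	m->held = true;
	return true;
}

static void dbg_mutex_unlock(struct dbg_mutex *m)
{
	m->held = false;
}

/* Stand-in for a helper that takes the device lock internally. */
static int toy_set_override(struct toy_rpdev *rpdev, const char *name)
{
	if (!dbg_mutex_lock(&rpdev->lock))
		return -1;
	snprintf(rpdev->driver_override, sizeof(rpdev->driver_override),
		 "%s", name);
	dbg_mutex_unlock(&rpdev->lock);
	return 0;
}

/* Shows the failing order first, then the working one. */
void toy_ordering_demo(void)
{
	struct toy_rpdev rpdev;

	memset(&rpdev, 0, sizeof(rpdev));	/* zeroed, lock not initialized */

	/* Helper called before the lock exists: the magic check trips. */
	assert(toy_set_override(&rpdev, "rpmsg_ns") == -1);

	/* Initialize first, then set the override. */
	dbg_mutex_init(&rpdev.lock);
	assert(toy_set_override(&rpdev, "rpmsg_ns") == 0);
	assert(strcmp(rpdev.driver_override, "rpmsg_ns") == 0);
}
```

In this model the failure does not depend on what the helper stores, only on
whether the lock was initialized before the helper ran.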

I don't have a device using qcom-smd rpmsg, so it's a bit tricky to
reproduce.

Best regards,
Krzysztof

2022-05-02 23:22:07

by Marek Szyprowski

Subject: Re: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

On 29.04.2022 16:16, Krzysztof Kozlowski wrote:
> On 29/04/2022 14:29, Marek Szyprowski wrote:
>> On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
>>> The driver_override field from platform driver should not be initialized
>>> from static memory (string literal) because the core later kfree() it,
>>> for example when driver_override is set via sysfs.
>>>
>>> Use dedicated helper to set driver_override properly.
>>>
>>> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
>>> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
>>> Signed-off-by: Krzysztof Kozlowski <[email protected]>
>>> Reviewed-by: Bjorn Andersson <[email protected]>
>> This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
>> Fix kfree() of static memory on setting driver_override"). In my tests I
>> found that it triggers the following issue during boot of the
>> DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):
>>
>> ------------[ cut here ]------------
>> DEBUG_LOCKS_WARN_ON(lock->magic != lock)
>> WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
>> __mutex_lock+0x1ec/0x430
>> Modules linked in:
>> CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
>> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
>> Workqueue: events_unbound deferred_probe_work_func
>> pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : __mutex_lock+0x1ec/0x430
>> lr : __mutex_lock+0x1ec/0x430
>> ..
>> Call trace:
>>  __mutex_lock+0x1ec/0x430
>>  mutex_lock_nested+0x38/0x64
>>  driver_set_override+0x124/0x150
>>  qcom_smd_register_edge+0x2a8/0x4ec
>>  qcom_smd_probe+0x54/0x80
>>  platform_probe+0x68/0xe0
>>  really_probe.part.0+0x9c/0x29c
>>  __driver_probe_device+0x98/0x144
>>  driver_probe_device+0xac/0x14c
>>  __device_attach_driver+0xb8/0x120
>>  bus_for_each_drv+0x78/0xd0
>>  __device_attach+0xd8/0x180
>>  device_initial_probe+0x14/0x20
>>  bus_probe_device+0x9c/0xa4
>>  deferred_probe_work_func+0x88/0xc4
>>  process_one_work+0x288/0x6bc
>>  worker_thread+0x248/0x450
>>  kthread+0x118/0x11c
>>  ret_from_fork+0x10/0x20
>> irq event stamp: 3599
>> hardirqs last  enabled at (3599): [<ffff80000919053c>]
>> _raw_spin_unlock_irqrestore+0x98/0x9c
>> hardirqs last disabled at (3598): [<ffff800009190ba4>]
>> _raw_spin_lock_irqsave+0xc0/0xcc
>> softirqs last  enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
>> softirqs last disabled at (3549): [<ffff8000080a4514>]
>> __irq_exit_rcu+0x180/0x1ac
>> ---[ end trace 0000000000000000 ]---
>>
>> I don't see any direct relation between the $subject and the above log,
>> but reverting the $subject on top of linux next-20220429 hides/fixes it.
>> Maybe there is a kind of memory trashing somewhere there and your change
>> only revealed it?
> Thanks for the report. I think the error path of my patch is wrong - I
> should not kfree(rpdev->driver_override) from the rpmsg code. That's the
> only thing I see now...
>
> Could you test following patch and tell if it helps?
> https://pastebin.ubuntu.com/p/rp3q9Z5fXj/

This doesn't help, the issue is still reported.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2022-05-03 00:42:54

by Krzysztof Kozlowski

Subject: Re: [PATCH v7 12/12] rpmsg: Fix kfree() of static memory on setting driver_override

On 29/04/2022 14:29, Marek Szyprowski wrote:
> Hi Krzysztof,
>
> On 19.04.2022 13:34, Krzysztof Kozlowski wrote:
>> The driver_override field from platform driver should not be initialized
>> from static memory (string literal) because the core later kfree() it,
>> for example when driver_override is set via sysfs.
>>
>> Use dedicated helper to set driver_override properly.
>>
>> Fixes: 950a7388f02b ("rpmsg: Turn name service into a stand alone driver")
>> Fixes: c0cdc19f84a4 ("rpmsg: Driver for user space endpoint interface")
>> Signed-off-by: Krzysztof Kozlowski <[email protected]>
>> Reviewed-by: Bjorn Andersson <[email protected]>
>
> This patch landed recently in linux-next as commit 42cd402b8fd4 ("rpmsg:
> Fix kfree() of static memory on setting driver_override"). In my tests I
> found that it triggers the following issue during boot of the
> DragonBoard410c SBC (arch/arm64/boot/dts/qcom/apq8016-sbc.dtb):
>
> ------------[ cut here ]------------
> DEBUG_LOCKS_WARN_ON(lock->magic != lock)
> WARNING: CPU: 1 PID: 8 at kernel/locking/mutex.c:582
> __mutex_lock+0x1ec/0x430
> Modules linked in:
> CPU: 1 PID: 8 Comm: kworker/u8:0 Not tainted 5.18.0-rc4-next-20220429 #11815
> Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> Workqueue: events_unbound deferred_probe_work_func
> pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : __mutex_lock+0x1ec/0x430
> lr : __mutex_lock+0x1ec/0x430
> ..
> Call trace:
>  __mutex_lock+0x1ec/0x430
>  mutex_lock_nested+0x38/0x64
>  driver_set_override+0x124/0x150
>  qcom_smd_register_edge+0x2a8/0x4ec
>  qcom_smd_probe+0x54/0x80
>  platform_probe+0x68/0xe0
>  really_probe.part.0+0x9c/0x29c
>  __driver_probe_device+0x98/0x144
>  driver_probe_device+0xac/0x14c
>  __device_attach_driver+0xb8/0x120
>  bus_for_each_drv+0x78/0xd0
>  __device_attach+0xd8/0x180
>  device_initial_probe+0x14/0x20
>  bus_probe_device+0x9c/0xa4
>  deferred_probe_work_func+0x88/0xc4
>  process_one_work+0x288/0x6bc
>  worker_thread+0x248/0x450
>  kthread+0x118/0x11c
>  ret_from_fork+0x10/0x20
> irq event stamp: 3599
> hardirqs last  enabled at (3599): [<ffff80000919053c>]
> _raw_spin_unlock_irqrestore+0x98/0x9c
> hardirqs last disabled at (3598): [<ffff800009190ba4>]
> _raw_spin_lock_irqsave+0xc0/0xcc
> softirqs last  enabled at (3554): [<ffff800008010470>] _stext+0x470/0x5e8
> softirqs last disabled at (3549): [<ffff8000080a4514>]
> __irq_exit_rcu+0x180/0x1ac
> ---[ end trace 0000000000000000 ]---
>
> I don't see any direct relation between the $subject and the above log,
> but reverting the $subject on top of linux next-20220429 hides/fixes it.
> Maybe there is a kind of memory trashing somewhere there and your change
> only revealed it?

Thanks for the report. I think the error path of my patch is wrong - I
should not kfree(rpdev->driver_override) from the rpmsg code. That's the
only thing I see now...

Could you test the following patch and tell me if it helps?
https://pastebin.ubuntu.com/p/rp3q9Z5fXj/

-----

diff --git a/drivers/rpmsg/rpmsg_internal.h b/drivers/rpmsg/rpmsg_internal.h
index 3e81642238d2..1e2ad944e2ec 100644
--- a/drivers/rpmsg/rpmsg_internal.h
+++ b/drivers/rpmsg/rpmsg_internal.h
@@ -102,11 +102,7 @@ static inline int rpmsg_ctrldev_register_device(struct rpmsg_device *rpdev)
 	if (ret)
 		return ret;

-	ret = rpmsg_register_device(rpdev);
-	if (ret)
-		kfree(rpdev->driver_override);
-
-	return ret;
+	return rpmsg_register_device(rpdev);
 }

 #endif
diff --git a/drivers/rpmsg/rpmsg_ns.c b/drivers/rpmsg/rpmsg_ns.c
index 8eb8f328237e..f26078467899 100644
--- a/drivers/rpmsg/rpmsg_ns.c
+++ b/drivers/rpmsg/rpmsg_ns.c
@@ -31,11 +31,7 @@ int rpmsg_ns_register_device(struct rpmsg_device *rpdev)
 	rpdev->src = RPMSG_NS_ADDR;
 	rpdev->dst = RPMSG_NS_ADDR;

-	ret = rpmsg_register_device(rpdev);
-	if (ret)
-		kfree(rpdev->driver_override);
-
-	return ret;
+	return rpmsg_register_device(rpdev);
 }
 EXPORT_SYMBOL(rpmsg_ns_register_device);