2023-10-05 14:37:03

by Moshe Shemesh

[permalink] [raw]
Subject: [PATCH net v2] devlink: Hold devlink lock on health reporter dump get

Devlink health dump get callback should take devlink lock as any other
devlink callback. Otherwise, since devlink_mutex was removed, this
callback is not protected from a race of the reporter being destroyed
while handling the callback.

Add devlink lock to the callback and to any call for
devlink_health_do_dump(). This should be safe as non of the drivers dump
callback implementation takes devlink lock.

As devlink lock is added to any callback of dump, the reporter dump_lock
is now redundant and can be removed.

Fixes: d3efc2a6a6d8 ("net: devlink: remove devlink_mutex")
Signed-off-by: Moshe Shemesh <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
---
v1->v2:
- Commit message, added why this fix is necessary and why its safe to do.
---
net/devlink/health.c | 30 ++++++++++++++++--------------
1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/net/devlink/health.c b/net/devlink/health.c
index 638cad8d5c65..51e6e81e31bb 100644
--- a/net/devlink/health.c
+++ b/net/devlink/health.c
@@ -58,7 +58,6 @@ struct devlink_health_reporter {
struct devlink *devlink;
struct devlink_port *devlink_port;
struct devlink_fmsg *dump_fmsg;
- struct mutex dump_lock; /* lock parallel read/write from dump buffers */
u64 graceful_period;
bool auto_recover;
bool auto_dump;
@@ -125,7 +124,6 @@ __devlink_health_reporter_create(struct devlink *devlink,
reporter->graceful_period = graceful_period;
reporter->auto_recover = !!ops->recover;
reporter->auto_dump = !!ops->dump;
- mutex_init(&reporter->dump_lock);
return reporter;
}

@@ -226,7 +224,6 @@ EXPORT_SYMBOL_GPL(devlink_health_reporter_create);
static void
devlink_health_reporter_free(struct devlink_health_reporter *reporter)
{
- mutex_destroy(&reporter->dump_lock);
if (reporter->dump_fmsg)
devlink_fmsg_free(reporter->dump_fmsg);
kfree(reporter);
@@ -625,10 +622,10 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
}

if (reporter->auto_dump) {
- mutex_lock(&reporter->dump_lock);
+ devl_lock(devlink);
/* store current dump of current error, for later analysis */
devlink_health_do_dump(reporter, priv_ctx, NULL);
- mutex_unlock(&reporter->dump_lock);
+ devl_unlock(devlink);
}

if (!reporter->auto_recover)
@@ -1262,7 +1259,7 @@ int devlink_nl_cmd_health_reporter_diagnose_doit(struct sk_buff *skb,
}

static struct devlink_health_reporter *
-devlink_health_reporter_get_from_cb(struct netlink_callback *cb)
+devlink_health_reporter_get_from_cb_lock(struct netlink_callback *cb)
{
const struct genl_info *info = genl_info_dump(cb);
struct devlink_health_reporter *reporter;
@@ -1272,10 +1269,12 @@ devlink_health_reporter_get_from_cb(struct netlink_callback *cb)
devlink = devlink_get_from_attrs_lock(sock_net(cb->skb->sk), attrs);
if (IS_ERR(devlink))
return NULL;
- devl_unlock(devlink);

reporter = devlink_health_reporter_get_from_attrs(devlink, attrs);
- devlink_put(devlink);
+ if (!reporter) {
+ devl_unlock(devlink);
+ devlink_put(devlink);
+ }
return reporter;
}

@@ -1284,16 +1283,20 @@ int devlink_nl_cmd_health_reporter_dump_get_dumpit(struct sk_buff *skb,
{
struct devlink_nl_dump_state *state = devlink_dump_state(cb);
struct devlink_health_reporter *reporter;
+ struct devlink *devlink;
int err;

- reporter = devlink_health_reporter_get_from_cb(cb);
+ reporter = devlink_health_reporter_get_from_cb_lock(cb);
if (!reporter)
return -EINVAL;

- if (!reporter->ops->dump)
+ devlink = reporter->devlink;
+ if (!reporter->ops->dump) {
+ devl_unlock(devlink);
+ devlink_put(devlink);
return -EOPNOTSUPP;
+ }

- mutex_lock(&reporter->dump_lock);
if (!state->idx) {
err = devlink_health_do_dump(reporter, NULL, cb->extack);
if (err)
@@ -1309,7 +1312,8 @@ int devlink_nl_cmd_health_reporter_dump_get_dumpit(struct sk_buff *skb,
err = devlink_fmsg_dumpit(reporter->dump_fmsg, skb, cb,
DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET);
unlock:
- mutex_unlock(&reporter->dump_lock);
+ devl_unlock(devlink);
+ devlink_put(devlink);
return err;
}

@@ -1326,9 +1330,7 @@ int devlink_nl_cmd_health_reporter_dump_clear_doit(struct sk_buff *skb,
if (!reporter->ops->dump)
return -EOPNOTSUPP;

- mutex_lock(&reporter->dump_lock);
devlink_health_dump_clear(reporter);
- mutex_unlock(&reporter->dump_lock);
return 0;
}

--
2.27.0


2023-10-05 14:48:45

by Przemek Kitszel

[permalink] [raw]
Subject: Re: [PATCH net v2] devlink: Hold devlink lock on health reporter dump get

On 10/5/23 14:50, Moshe Shemesh wrote:
> Devlink health dump get callback should take devlink lock as any other
> devlink callback. Otherwise, since devlink_mutex was removed, this
> callback is not protected from a race of the reporter being destroyed
> while handling the callback.
>
> Add devlink lock to the callback and to any call for
> devlink_health_do_dump(). This should be safe as non of the drivers dump
> callback implementation takes devlink lock.
>
> As devlink lock is added to any callback of dump, the reporter dump_lock
> is now redundant and can be removed.
>
> Fixes: d3efc2a6a6d8 ("net: devlink: remove devlink_mutex")
> Signed-off-by: Moshe Shemesh <[email protected]>
> Reviewed-by: Jiri Pirko <[email protected]>
> ---
> v1->v2:
> - Commit message, added why this fix is necessary and why its safe to do.
> ---
> net/devlink/health.c | 30 ++++++++++++++++--------------
> 1 file changed, 16 insertions(+), 14 deletions(-)
>

Reviewed-by: Przemek Kitszel <[email protected]>


2023-10-06 23:01:56

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net v2] devlink: Hold devlink lock on health reporter dump get

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <[email protected]>:

On Thu, 5 Oct 2023 15:50:16 +0300 you wrote:
> Devlink health dump get callback should take devlink lock as any other
> devlink callback. Otherwise, since devlink_mutex was removed, this
> callback is not protected from a race of the reporter being destroyed
> while handling the callback.
>
> Add devlink lock to the callback and to any call for
> devlink_health_do_dump(). This should be safe as non of the drivers dump
> callback implementation takes devlink lock.
>
> [...]

Here is the summary with links:
- [net,v2] devlink: Hold devlink lock on health reporter dump get
https://git.kernel.org/netdev/net/c/aba0e909dc20

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html