From: Leon Romanovsky <[email protected]>
Hi,
Jakub's request to make sure that devlink events are delayed and not
emitted until they are fully accessible [1] requires us to implement a
delayed event notification system in devlink.
In order to do that, I moved some of my patches (xarray etc.) from a future
series to land before "Move devlink_register to be near devlink_reload_enable" [2].
That allows us to rely on the DEVLINK_REGISTERED xarray mark to decide whether
to emit an event or not.
The other patches are simple cleanups which are needed anyway.
[1] https://lore.kernel.org/lkml/20210811071817.4af5ab34@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
[2] https://lore.kernel.org/lkml/[email protected]
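As a rough sketch of that direction (not part of this series; the helpers
below are hypothetical and only show how the mark could gate notifications):
static bool devlink_is_registered(struct devlink *devlink)
{
	/* the devlinks xarray and DEVLINK_REGISTERED come from patch 4 */
	return xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
}
static void devlink_notify_if_registered(struct devlink *devlink,
					 enum devlink_command cmd)
{
	/* delay the event until the instance is visible to user space */
	if (!devlink_is_registered(devlink))
		return;
	devlink_notify(devlink, cmd);
}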
Next in the queue:
* Delay event series
* Move devlink_register to be near devlink_reload_enable
* Extension of devlink_ops to be set dynamically
* devlink_reload_* delete
* Devlink locks rework to use xarray and reference counting
* ????
Thanks
Leon Romanovsky (6):
devlink: Simplify devlink_pernet_pre_exit call
devlink: Remove check of always valid devlink pointer
devlink: Count struct devlink consumers
devlink: Use xarray to store devlink instances
devlink: Clear whole devlink_flash_notify struct
net: hns3: remove always exist devlink pointer check
.../hisilicon/hns3/hns3pf/hclge_devlink.c | 8 +-
.../hisilicon/hns3/hns3vf/hclgevf_devlink.c | 8 +-
include/net/devlink.h | 4 +-
net/core/devlink.c | 391 ++++++++++++------
4 files changed, 273 insertions(+), 138 deletions(-)
--
2.31.1
From: Leon Romanovsky <[email protected]>
The devlink_pernet_pre_exit() routine is called when a network namespace
exits. It is relevant only for devlink instances that were previously
assigned to that namespace. Such an assignment is possible only with the
following command: "devlink reload DEV netns ...", which already checks reload support.
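For reference, such an assignment is made with an iproute2 invocation along
these lines (the device name and namespace below are placeholders):
	$ devlink dev reload pci/0000:03:00.0 netns ns1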
Signed-off-by: Leon Romanovsky <[email protected]>
---
net/core/devlink.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/net/core/devlink.c b/net/core/devlink.c
index ee9787314cff..9e74a95b3322 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -11392,16 +11392,16 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
*/
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (net_eq(devlink_net(devlink), net)) {
- if (WARN_ON(!devlink_reload_supported(devlink->ops)))
- continue;
- err = devlink_reload(devlink, &init_net,
- DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
- DEVLINK_RELOAD_LIMIT_UNSPEC,
- &actions_performed, NULL);
- if (err && err != -EOPNOTSUPP)
- pr_warn("Failed to reload devlink instance into init_net\n");
- }
+ if (!net_eq(devlink_net(devlink), net))
+ continue;
+
+ WARN_ON(!devlink_reload_supported(devlink->ops));
+ err = devlink_reload(devlink, &init_net,
+ DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
+ DEVLINK_RELOAD_LIMIT_UNSPEC,
+ &actions_performed, NULL);
+ if (err && err != -EOPNOTSUPP)
+ pr_warn("Failed to reload devlink instance into init_net\n");
}
mutex_unlock(&devlink_mutex);
}
--
2.31.1
From: Leon Romanovsky <[email protected]>
The { 0 } initialiser doesn't explicitly clear all fields in the struct:
it tells the compiler to zero the fields but doesn't touch any sub-fields
if they exist, which is error-prone for future devlink_flash_notify
extensions.
The {} form is an empty initialiser that instructs the compiler to fully
initialize the whole struct, including sub-fields.
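As a minimal illustration of the two spellings (toy struct, not kernel code;
the preference for {} follows the reasoning above):
	struct progress { u64 done, total; };
	struct notify_example {
		const char *status_msg;
		struct progress p;
	};
	struct notify_example a = { 0 };	/* zero via first-member initialiser */
	struct notify_example b = {};		/* empty initialiser, used after this patch */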
Fixes: 6700acc5f1fe ("devlink: collect flash notify params into a struct")
Signed-off-by: Leon Romanovsky <[email protected]>
---
net/core/devlink.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/devlink.c b/net/core/devlink.c
index d218f57ad8cf..a856ae401ea5 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4169,7 +4169,7 @@ static void __devlink_flash_update_notify(struct devlink *devlink,
static void devlink_flash_update_begin_notify(struct devlink *devlink)
{
- struct devlink_flash_notify params = { 0 };
+ struct devlink_flash_notify params = {};
__devlink_flash_update_notify(devlink,
DEVLINK_CMD_FLASH_UPDATE,
@@ -4178,7 +4178,7 @@ static void devlink_flash_update_begin_notify(struct devlink *devlink)
static void devlink_flash_update_end_notify(struct devlink *devlink)
{
- struct devlink_flash_notify params = { 0 };
+ struct devlink_flash_notify params = {};
__devlink_flash_update_notify(devlink,
DEVLINK_CMD_FLASH_UPDATE_END,
--
2.31.1
From: Leon Romanovsky <[email protected]>
We can use an xarray instead of the linearly organized linked list for the
devlink instances. This will let us revise the locking scheme in favour
of the internal xarray locking that protects the database.
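Condensed from the diff below, the instance lifecycle against the new xarray
becomes:
	/* devlink_alloc_ns(): allocate an index for the instance */
	xa_alloc_cyclic(&devlinks, &devlink->index, devlink, xa_limit_31b,
			&last_id, GFP_KERNEL);
	/* devlink_register(): make the instance visible to iterators */
	xa_set_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
	/* devlink_unregister(): hide it again */
	xa_clear_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
	/* devlink_free(): drop the entry */
	xa_erase(&devlinks, devlink->index);
and the dumpit loops switch from list_for_each_entry() to
xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED).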
Signed-off-by: Leon Romanovsky <[email protected]>
---
include/net/devlink.h | 2 +-
net/core/devlink.c | 70 ++++++++++++++++++++++++++++++-------------
2 files changed, 50 insertions(+), 22 deletions(-)
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 4c60d61d92da..154cf0dbca37 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -32,7 +32,7 @@ struct devlink_dev_stats {
struct devlink_ops;
struct devlink {
- struct list_head list;
+ u32 index;
struct list_head port_list;
struct list_head rate_list;
struct list_head sb_list;
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 76f459da6e05..d218f57ad8cf 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -92,7 +92,8 @@ static const struct nla_policy devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_
DEVLINK_PORT_FN_STATE_ACTIVE),
};
-static LIST_HEAD(devlink_list);
+static DEFINE_XARRAY_FLAGS(devlinks, XA_FLAGS_ALLOC);
+#define DEVLINK_REGISTERED XA_MARK_1
/* devlink_mutex
*
@@ -123,6 +124,7 @@ static struct devlink *devlink_get_from_attrs(struct net *net,
struct nlattr **attrs)
{
struct devlink *devlink;
+ unsigned long index;
bool found = false;
char *busname;
char *devname;
@@ -135,7 +137,7 @@ static struct devlink *devlink_get_from_attrs(struct net *net,
lockdep_assert_held(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (strcmp(devlink->dev->bus->name, busname) == 0 &&
strcmp(dev_name(devlink->dev), devname) == 0 &&
net_eq(devlink_net(devlink), net)) {
@@ -1087,11 +1089,12 @@ static int devlink_nl_cmd_rate_get_dumpit(struct sk_buff *msg,
struct devlink_rate *devlink_rate;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -1189,11 +1192,12 @@ static int devlink_nl_cmd_get_dumpit(struct sk_buff *msg,
{
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -1251,11 +1255,12 @@ static int devlink_nl_cmd_port_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
struct devlink_port *devlink_port;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -1916,11 +1921,12 @@ static int devlink_nl_cmd_sb_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
struct devlink_sb *devlink_sb;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -2067,11 +2073,12 @@ static int devlink_nl_cmd_sb_pool_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
struct devlink_sb *devlink_sb;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -2287,11 +2294,12 @@ static int devlink_nl_cmd_sb_port_pool_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
struct devlink_sb *devlink_sb;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -2535,11 +2543,12 @@ devlink_nl_cmd_sb_tc_pool_bind_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
struct devlink_sb *devlink_sb;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -4611,11 +4620,12 @@ static int devlink_nl_cmd_param_get_dumpit(struct sk_buff *msg,
struct devlink_param_item *param_item;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -4886,11 +4896,12 @@ static int devlink_nl_cmd_port_param_get_dumpit(struct sk_buff *msg,
struct devlink_port *devlink_port;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -5462,11 +5473,12 @@ static int devlink_nl_cmd_region_get_dumpit(struct sk_buff *msg,
{
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -5995,11 +6007,12 @@ static int devlink_nl_cmd_info_get_dumpit(struct sk_buff *msg,
{
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err = 0;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -7176,11 +7189,12 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
struct devlink_port *port;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -7210,7 +7224,7 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
devlink_put(devlink);
}
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -7771,11 +7785,12 @@ static int devlink_nl_cmd_trap_get_dumpit(struct sk_buff *msg,
struct devlink_trap_item *trap_item;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -7997,11 +8012,12 @@ static int devlink_nl_cmd_trap_group_get_dumpit(struct sk_buff *msg,
u32 portid = NETLINK_CB(cb->skb).portid;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -8310,11 +8326,12 @@ static int devlink_nl_cmd_trap_policer_get_dumpit(struct sk_buff *msg,
u32 portid = NETLINK_CB(cb->skb).portid;
struct devlink *devlink;
int start = cb->args[0];
+ unsigned long index;
int idx = 0;
int err;
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
@@ -8899,6 +8916,8 @@ struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
struct device *dev)
{
struct devlink *devlink;
+ static u32 last_id;
+ int ret;
WARN_ON(!ops || !dev);
if (!devlink_reload_actions_valid(ops))
@@ -8908,6 +8927,13 @@ struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
if (!devlink)
return NULL;
+ ret = xa_alloc_cyclic(&devlinks, &devlink->index, devlink, xa_limit_31b,
+ &last_id, GFP_KERNEL);
+ if (ret < 0) {
+ kfree(devlink);
+ return NULL;
+ }
+
devlink->dev = dev;
devlink->ops = ops;
xa_init_flags(&devlink->snapshot_ids, XA_FLAGS_ALLOC);
@@ -8940,7 +8966,7 @@ EXPORT_SYMBOL_GPL(devlink_alloc_ns);
int devlink_register(struct devlink *devlink)
{
mutex_lock(&devlink_mutex);
- list_add_tail(&devlink->list, &devlink_list);
+ xa_set_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
devlink_notify(devlink, DEVLINK_CMD_NEW);
mutex_unlock(&devlink_mutex);
return 0;
@@ -8961,7 +8987,7 @@ void devlink_unregister(struct devlink *devlink)
WARN_ON(devlink_reload_supported(devlink->ops) &&
devlink->reload_enabled);
devlink_notify(devlink, DEVLINK_CMD_DEL);
- list_del(&devlink->list);
+ xa_clear_mark(&devlinks, devlink->index, DEVLINK_REGISTERED);
mutex_unlock(&devlink_mutex);
}
EXPORT_SYMBOL_GPL(devlink_unregister);
@@ -9023,6 +9049,7 @@ void devlink_free(struct devlink *devlink)
WARN_ON(!list_empty(&devlink->port_list));
xa_destroy(&devlink->snapshot_ids);
+ xa_erase(&devlinks, devlink->index);
kfree(devlink);
}
@@ -11497,13 +11524,14 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
{
struct devlink *devlink;
u32 actions_performed;
+ unsigned long index;
int err;
/* In case network namespace is getting destroyed, reload
* all devlink instances from this namespace into init_net.
*/
mutex_lock(&devlink_mutex);
- list_for_each_entry(devlink, &devlink_list, list) {
+ xa_for_each_marked(&devlinks, index, devlink, DEVLINK_REGISTERED) {
if (!devlink_try_get(devlink))
continue;
--
2.31.1
From: Leon Romanovsky <[email protected]>
The struct devlink itself is protected by an internal lock and doesn't
need the global lock during operation. That global lock is used to protect
addition/removal of new devlink instances to/from the global list used by
all devlink consumers in the system.
The future conversion of the linked list to an xarray will allow us to
actually delete that lock, but first we need to count all struct devlink
users.
The reference counting provides us a way to ensure that no new user space
command succeeds in grabbing a devlink instance which is going to be
destroyed, which makes it safe to access it without the lock.
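Condensed from the diff below, the pattern every dumpit iterator follows is:
	mutex_lock(&devlink_mutex);
	list_for_each_entry(devlink, &devlink_list, list) {
		if (!devlink_try_get(devlink))	/* instance is going away */
			continue;
		if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
			goto retry;
		/* ... fill the netlink message under devlink->lock ... */
retry:
		devlink_put(devlink);
	}
	mutex_unlock(&devlink_mutex);
while devlink_unregister() drops the initial reference and waits for the last
user before tearing the instance down:
	devlink_put(devlink);
	wait_for_completion(&devlink->comp);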
Signed-off-by: Leon Romanovsky <[email protected]>
---
include/net/devlink.h | 2 +
net/core/devlink.c | 205 ++++++++++++++++++++++++++++++++++--------
2 files changed, 172 insertions(+), 35 deletions(-)
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 1151497c0ec5..4c60d61d92da 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -56,6 +56,8 @@ struct devlink {
*/
u8 reload_failed:1,
reload_enabled:1;
+ refcount_t refcount;
+ struct completion comp;
char priv[0] __aligned(NETDEV_ALIGN);
};
diff --git a/net/core/devlink.c b/net/core/devlink.c
index c8a8eecad1c5..76f459da6e05 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -108,10 +108,22 @@ struct net *devlink_net(const struct devlink *devlink)
}
EXPORT_SYMBOL_GPL(devlink_net);
+static void devlink_put(struct devlink *devlink)
+{
+ if (refcount_dec_and_test(&devlink->refcount))
+ complete(&devlink->comp);
+}
+
+static bool __must_check devlink_try_get(struct devlink *devlink)
+{
+ return refcount_inc_not_zero(&devlink->refcount);
+}
+
static struct devlink *devlink_get_from_attrs(struct net *net,
struct nlattr **attrs)
{
struct devlink *devlink;
+ bool found = false;
char *busname;
char *devname;
@@ -126,16 +138,16 @@ static struct devlink *devlink_get_from_attrs(struct net *net,
list_for_each_entry(devlink, &devlink_list, list) {
if (strcmp(devlink->dev->bus->name, busname) == 0 &&
strcmp(dev_name(devlink->dev), devname) == 0 &&
- net_eq(devlink_net(devlink), net))
- return devlink;
+ net_eq(devlink_net(devlink), net)) {
+ found = true;
+ break;
+ }
}
- return ERR_PTR(-ENODEV);
-}
+ if (!found || !devlink_try_get(devlink))
+ devlink = ERR_PTR(-ENODEV);
-static struct devlink *devlink_get_from_info(struct genl_info *info)
-{
- return devlink_get_from_attrs(genl_info_net(info), info->attrs);
+ return devlink;
}
static struct devlink_port *devlink_port_get_by_index(struct devlink *devlink,
@@ -486,7 +498,7 @@ static int devlink_nl_pre_doit(const struct genl_ops *ops,
int err;
mutex_lock(&devlink_mutex);
- devlink = devlink_get_from_info(info);
+ devlink = devlink_get_from_attrs(genl_info_net(info), info->attrs);
if (IS_ERR(devlink)) {
mutex_unlock(&devlink_mutex);
return PTR_ERR(devlink);
@@ -529,6 +541,7 @@ static int devlink_nl_pre_doit(const struct genl_ops *ops,
unlock:
if (~ops->internal_flags & DEVLINK_NL_FLAG_NO_LOCK)
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
mutex_unlock(&devlink_mutex);
return err;
}
@@ -541,6 +554,7 @@ static void devlink_nl_post_doit(const struct genl_ops *ops,
devlink = info->user_ptr[0];
if (~ops->internal_flags & DEVLINK_NL_FLAG_NO_LOCK)
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
mutex_unlock(&devlink_mutex);
}
@@ -1078,8 +1092,12 @@ static int devlink_nl_cmd_rate_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_rate, &devlink->rate_list, list) {
enum devlink_command cmd = DEVLINK_CMD_RATE_NEW;
@@ -1094,11 +1112,14 @@ static int devlink_nl_cmd_rate_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI, NULL);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -1173,15 +1194,24 @@ static int devlink_nl_cmd_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk))) {
+ devlink_put(devlink);
+ continue;
+ }
+
if (idx < start) {
idx++;
+ devlink_put(devlink);
continue;
}
+
err = devlink_nl_fill(msg, devlink, DEVLINK_CMD_NEW,
NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq, NLM_F_MULTI);
+ devlink_put(devlink);
if (err)
goto out;
idx++;
@@ -1226,8 +1256,12 @@ static int devlink_nl_cmd_port_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_port, &devlink->port_list, list) {
if (idx < start) {
@@ -1241,11 +1275,14 @@ static int devlink_nl_cmd_port_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI, cb->extack);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -1884,8 +1921,12 @@ static int devlink_nl_cmd_sb_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
if (idx < start) {
@@ -1899,11 +1940,14 @@ static int devlink_nl_cmd_sb_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -2028,9 +2072,13 @@ static int devlink_nl_cmd_sb_pool_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
+ if (!devlink_try_get(devlink))
+ continue;
+
if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
!devlink->ops->sb_pool_get)
- continue;
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
err = __sb_pool_get_dumpit(msg, start, &idx, devlink,
@@ -2041,10 +2089,13 @@ static int devlink_nl_cmd_sb_pool_get_dumpit(struct sk_buff *msg,
err = 0;
} else if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -2241,9 +2292,13 @@ static int devlink_nl_cmd_sb_port_pool_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
+ if (!devlink_try_get(devlink))
+ continue;
+
if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
!devlink->ops->sb_port_pool_get)
- continue;
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
err = __sb_port_pool_get_dumpit(msg, start, &idx,
@@ -2254,10 +2309,13 @@ static int devlink_nl_cmd_sb_port_pool_get_dumpit(struct sk_buff *msg,
err = 0;
} else if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -2482,9 +2540,12 @@ devlink_nl_cmd_sb_tc_pool_bind_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
+ if (!devlink_try_get(devlink))
+ continue;
+
if (!net_eq(devlink_net(devlink), sock_net(msg->sk)) ||
!devlink->ops->sb_tc_pool_bind_get)
- continue;
+ goto retry;
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_sb, &devlink->sb_list, list) {
@@ -2497,10 +2558,13 @@ devlink_nl_cmd_sb_tc_pool_bind_get_dumpit(struct sk_buff *msg,
err = 0;
} else if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -4552,8 +4616,12 @@ static int devlink_nl_cmd_param_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(param_item, &devlink->param_list, list) {
if (idx < start) {
@@ -4569,11 +4637,14 @@ static int devlink_nl_cmd_param_get_dumpit(struct sk_buff *msg,
err = 0;
} else if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -4820,8 +4891,12 @@ static int devlink_nl_cmd_port_param_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(devlink_port, &devlink->port_list, list) {
list_for_each_entry(param_item,
@@ -4841,12 +4916,15 @@ static int devlink_nl_cmd_port_param_get_dumpit(struct sk_buff *msg,
err = 0;
} else if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -5385,14 +5463,20 @@ static int devlink_nl_cmd_region_get_dumpit(struct sk_buff *msg,
struct devlink *devlink;
int start = cb->args[0];
int idx = 0;
- int err;
+ int err = 0;
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
err = devlink_nl_cmd_region_get_devlink_dumpit(msg, cb, devlink,
&idx, start);
+retry:
+ devlink_put(devlink);
if (err)
goto out;
}
@@ -5755,6 +5839,7 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
nla_nest_end(skb, chunks_attr);
genlmsg_end(skb, hdr);
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
mutex_unlock(&devlink_mutex);
return skb->len;
@@ -5763,6 +5848,7 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
genlmsg_cancel(skb, hdr);
out_unlock:
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
out_dev:
mutex_unlock(&devlink_mutex);
return err;
@@ -5914,17 +6000,14 @@ static int devlink_nl_cmd_info_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
- if (idx < start) {
- idx++;
- continue;
- }
- if (!devlink->ops->info_get) {
- idx++;
- continue;
- }
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
+ if (idx < start || !devlink->ops->info_get)
+ goto inc;
mutex_lock(&devlink->lock);
err = devlink_nl_info_fill(msg, devlink, DEVLINK_CMD_INFO_GET,
@@ -5934,9 +6017,14 @@ static int devlink_nl_cmd_info_get_dumpit(struct sk_buff *msg,
mutex_unlock(&devlink->lock);
if (err == -EOPNOTSUPP)
err = 0;
- else if (err)
+ else if (err) {
+ devlink_put(devlink);
break;
+ }
+inc:
idx++;
+retry:
+ devlink_put(devlink);
}
mutex_unlock(&devlink_mutex);
@@ -7021,6 +7109,7 @@ devlink_health_reporter_get_from_cb(struct netlink_callback *cb)
goto unlock;
reporter = devlink_health_reporter_get_from_attrs(devlink, attrs);
+ devlink_put(devlink);
mutex_unlock(&devlink_mutex);
return reporter;
unlock:
@@ -7092,8 +7181,12 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry_rep;
+
mutex_lock(&devlink->reporters_lock);
list_for_each_entry(reporter, &devlink->reporter_list,
list) {
@@ -7107,16 +7200,23 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI);
if (err) {
mutex_unlock(&devlink->reporters_lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->reporters_lock);
+retry_rep:
+ devlink_put(devlink);
}
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry_port;
+
mutex_lock(&devlink->lock);
list_for_each_entry(port, &devlink->port_list, list) {
mutex_lock(&port->reporters_lock);
@@ -7133,6 +7233,7 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
if (err) {
mutex_unlock(&port->reporters_lock);
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
@@ -7140,6 +7241,8 @@ devlink_nl_cmd_health_reporter_get_dumpit(struct sk_buff *msg,
mutex_unlock(&port->reporters_lock);
}
mutex_unlock(&devlink->lock);
+retry_port:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -7673,8 +7776,12 @@ static int devlink_nl_cmd_trap_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(trap_item, &devlink->trap_list, list) {
if (idx < start) {
@@ -7688,11 +7795,14 @@ static int devlink_nl_cmd_trap_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -7892,8 +8002,12 @@ static int devlink_nl_cmd_trap_group_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(group_item, &devlink->trap_group_list,
list) {
@@ -7908,11 +8022,14 @@ static int devlink_nl_cmd_trap_group_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -8198,8 +8315,12 @@ static int devlink_nl_cmd_trap_policer_get_dumpit(struct sk_buff *msg,
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ if (!devlink_try_get(devlink))
continue;
+
+ if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
+ goto retry;
+
mutex_lock(&devlink->lock);
list_for_each_entry(policer_item, &devlink->trap_policer_list,
list) {
@@ -8214,11 +8335,14 @@ static int devlink_nl_cmd_trap_policer_get_dumpit(struct sk_buff *msg,
NLM_F_MULTI);
if (err) {
mutex_unlock(&devlink->lock);
+ devlink_put(devlink);
goto out;
}
idx++;
}
mutex_unlock(&devlink->lock);
+retry:
+ devlink_put(devlink);
}
out:
mutex_unlock(&devlink_mutex);
@@ -8801,6 +8925,9 @@ struct devlink *devlink_alloc_ns(const struct devlink_ops *ops,
INIT_LIST_HEAD(&devlink->trap_policer_list);
mutex_init(&devlink->lock);
mutex_init(&devlink->reporters_lock);
+ refcount_set(&devlink->refcount, 1);
+ init_completion(&devlink->comp);
+
return devlink;
}
EXPORT_SYMBOL_GPL(devlink_alloc_ns);
@@ -8827,6 +8954,9 @@ EXPORT_SYMBOL_GPL(devlink_register);
*/
void devlink_unregister(struct devlink *devlink)
{
+ devlink_put(devlink);
+ wait_for_completion(&devlink->comp);
+
mutex_lock(&devlink_mutex);
WARN_ON(devlink_reload_supported(devlink->ops) &&
devlink->reload_enabled);
@@ -11374,9 +11504,12 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
*/
mutex_lock(&devlink_mutex);
list_for_each_entry(devlink, &devlink_list, list) {
- if (!net_eq(devlink_net(devlink), net))
+ if (!devlink_try_get(devlink))
continue;
+ if (!net_eq(devlink_net(devlink), net))
+ goto retry;
+
WARN_ON(!devlink_reload_supported(devlink->ops));
err = devlink_reload(devlink, &init_net,
DEVLINK_RELOAD_ACTION_DRIVER_REINIT,
@@ -11384,6 +11517,8 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net)
&actions_performed, NULL);
if (err && err != -EOPNOTSUPP)
pr_warn("Failed to reload devlink instance into init_net\n");
+retry:
+ devlink_put(devlink);
}
mutex_unlock(&devlink_mutex);
}
--
2.31.1
From: Leon Romanovsky <[email protected]>
The devlink pointer always exists after hclge_devlink_init() succeeds.
Remove that check, together with setting the pointer to NULL after release,
and ensure that devlink_register() is the last command prior to the call to
devlink_reload_enable().
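Condensed from the diff below, the resulting order in hclge_devlink_init()
(and likewise in the VF variant) is:
	priv->hdev = hdev;
	hdev->devlink = devlink;	/* assign before registration */
	ret = devlink_register(devlink);
	if (ret)
		goto out_reg_fail;
	devlink_reload_enable(devlink);	/* last step */
	return 0;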
Fixes: b741269b2759 ("net: hns3: add support for registering devlink for PF")
Signed-off-by: Leon Romanovsky <[email protected]>
---
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_devlink.c | 8 +-------
.../net/ethernet/hisilicon/hns3/hns3vf/hclgevf_devlink.c | 8 +-------
2 files changed, 2 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_devlink.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_devlink.c
index 448f29aa4e6b..e4aad695abcc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_devlink.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_devlink.c
@@ -118,6 +118,7 @@ int hclge_devlink_init(struct hclge_dev *hdev)
priv = devlink_priv(devlink);
priv->hdev = hdev;
+ hdev->devlink = devlink;
ret = devlink_register(devlink);
if (ret) {
@@ -126,8 +127,6 @@ int hclge_devlink_init(struct hclge_dev *hdev)
goto out_reg_fail;
}
- hdev->devlink = devlink;
-
devlink_reload_enable(devlink);
return 0;
@@ -141,14 +140,9 @@ void hclge_devlink_uninit(struct hclge_dev *hdev)
{
struct devlink *devlink = hdev->devlink;
- if (!devlink)
- return;
-
devlink_reload_disable(devlink);
devlink_unregister(devlink);
devlink_free(devlink);
-
- hdev->devlink = NULL;
}
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_devlink.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_devlink.c
index 1e6061fb8ed4..f478770299c6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_devlink.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_devlink.c
@@ -120,6 +120,7 @@ int hclgevf_devlink_init(struct hclgevf_dev *hdev)
priv = devlink_priv(devlink);
priv->hdev = hdev;
+ hdev->devlink = devlink;
ret = devlink_register(devlink);
if (ret) {
@@ -128,8 +129,6 @@ int hclgevf_devlink_init(struct hclgevf_dev *hdev)
goto out_reg_fail;
}
- hdev->devlink = devlink;
-
devlink_reload_enable(devlink);
return 0;
@@ -143,14 +142,9 @@ void hclgevf_devlink_uninit(struct hclgevf_dev *hdev)
{
struct devlink *devlink = hdev->devlink;
- if (!devlink)
- return;
-
devlink_reload_disable(devlink);
devlink_unregister(devlink);
devlink_free(devlink);
-
- hdev->devlink = NULL;
}
--
2.31.1
Hello:
This series was applied to netdev/net-next.git (refs/heads/master):
On Sat, 14 Aug 2021 12:57:25 +0300 you wrote:
> From: Leon Romanovsky <[email protected]>
>
> Hi,
>
> Jakub's request to make sure that devlink events are delayed and not
> printed till they fully accessible [1] requires us to implement delayed
> event notification system in the devlink.
>
> [...]
Here is the summary with links:
- [net-next,1/6] devlink: Simplify devlink_pernet_pre_exit call
https://git.kernel.org/netdev/net-next/c/cbf6ab672eb4
- [net-next,2/6] devlink: Remove check of always valid devlink pointer
https://git.kernel.org/netdev/net-next/c/7ca973dc9fe5
- [net-next,3/6] devlink: Count struct devlink consumers
https://git.kernel.org/netdev/net-next/c/437ebfd90a25
- [net-next,4/6] devlink: Use xarray to store devlink instances
https://git.kernel.org/netdev/net-next/c/11a861d767cd
- [net-next,5/6] devlink: Clear whole devlink_flash_notify struct
https://git.kernel.org/netdev/net-next/c/ed43fbac7178
- [net-next,6/6] net: hns3: remove always exist devlink pointer check
https://git.kernel.org/netdev/net-next/c/a1fcb106ae97
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> From: Leon Romanovsky <[email protected]>
>
> The struct devlink itself is protected by internal lock and doesn't
> need global lock during operation. That global lock is used to protect
> addition/removal new devlink instances from the global list in use by
> all devlink consumers in the system.
>
> The future conversion of linked list to be xarray will allow us to
> actually delete that lock, but first we need to count all struct devlink
> users.
Not a problem with this set but to state the obvious the global devlink
lock also protects from concurrent execution of all the ops which don't
take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
this but I thought I'd comment on an off chance it helps.
> The reference counting provides us a way to ensure that no new user
> space commands success to grab devlink instance which is going to be
> destroyed makes it is safe to access it without lock.
On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > From: Leon Romanovsky <[email protected]>
> >
> > The struct devlink itself is protected by internal lock and doesn't
> > need global lock during operation. That global lock is used to protect
> > addition/removal new devlink instances from the global list in use by
> > all devlink consumers in the system.
> >
> > The future conversion of linked list to be xarray will allow us to
> > actually delete that lock, but first we need to count all struct devlink
> > users.
>
> Not a problem with this set but to state the obvious the global devlink
> lock also protects from concurrent execution of all the ops which don't
> take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> this but I thought I'd comment on an off chance it helps.
The end goal will be something like this:
1. Delete devlink lock
2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
3. Convert devlink->lock to be read/write lock to make sure that we can run
get query in parallel.
4. Open devlink netlink to parallel ops, ".parallel_ops = true".
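A rough, purely hypothetical shape for step 3 (the rwsem field is made up for
illustration; nothing like this exists in the tree yet):
	/* read side: get/dump queries could run in parallel */
	down_read(&devlink->rwsem);
	/* ... fill the netlink message ... */
	up_read(&devlink->rwsem);
	/* write side: state-changing ops, e.g. eswitch mode set */
	down_write(&devlink->rwsem);
	/* ... modify instance state ... */
	up_write(&devlink->rwsem);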
Thanks
>
> > The reference counting provides us a way to ensure that no new user
> > space commands success to grab devlink instance which is going to be
> > destroyed makes it is safe to access it without lock.
>
On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > From: Leon Romanovsky <[email protected]>
> > >
> > > The struct devlink itself is protected by internal lock and doesn't
> > > need global lock during operation. That global lock is used to protect
> > > addition/removal new devlink instances from the global list in use by
> > > all devlink consumers in the system.
> > >
> > > The future conversion of linked list to be xarray will allow us to
> > > actually delete that lock, but first we need to count all struct devlink
> > > users.
> >
> > Not a problem with this set but to state the obvious the global devlink
> > lock also protects from concurrent execution of all the ops which don't
> > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> > this but I thought I'd comment on an off chance it helps.
>
> The end goal will be something like that:
> 1. Delete devlink lock
> 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> 3. Convert devlink->lock to be read/write lock to make sure that we can run
> get query in parallel.
> 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
IIUC that'd mean setting eswitch mode would hold write lock on
the dl instance. What locks does e.g. registering a dl port take
then?
> -----Original Message-----
> From: Leon Romanovsky <[email protected]>
> Sent: Saturday, August 14, 2021 2:57 AM
> To: David S . Miller <[email protected]>; Jakub Kicinski <[email protected]>
> Cc: Leon Romanovsky <[email protected]>; Guangbin Huang
> <[email protected]>; Keller, Jacob E <[email protected]>; Jiri
> Pirko <[email protected]>; [email protected]; [email protected];
> Salil Mehta <[email protected]>; Shannon Nelson
> <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> Mo <[email protected]>
> Subject: [PATCH net-next 4/6] devlink: Use xarray to store devlink instances
>
> From: Leon Romanovsky <[email protected]>
>
> We can use xarray instead of linearly organized linked lists for the
> devlink instances. This will let us revise the locking scheme in favour
> of internal xarray locking that protects database.
>
Nice. Seems like an xarray makes quite a bit of sense here vs a linked list, and it's resizable. Plus we can use marks to loop over the registered devlinks. Ok.
The conversions to xa interfaces look correct to me.
Reviewed-by: Jacob Keller <[email protected]>
Thanks,
Jake
> [...]
> -----Original Message-----
> From: Leon Romanovsky <[email protected]>
> Sent: Saturday, August 14, 2021 2:58 AM
> To: David S . Miller <[email protected]>; Jakub Kicinski <[email protected]>
> Cc: Leon Romanovsky <[email protected]>; Guangbin Huang
> <[email protected]>; Keller, Jacob E <[email protected]>; Jiri
> Pirko <[email protected]>; [email protected]; [email protected];
> Salil Mehta <[email protected]>; Shannon Nelson
> <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> Mo <[email protected]>
> Subject: [PATCH net-next 5/6] devlink: Clear whole devlink_flash_notify struct
>
> From: Leon Romanovsky <[email protected]>
>
> The { 0 } doesn't clear all fields in the struct, but tells to the
> compiler to set all fields to zero and doesn't touch any sub-fields
> if they exists.
>
> The {} is an empty initialiser that instructs to fully initialize whole
> struct including sub-fields, which is error-prone for future
> devlink_flash_notify extensions.
>
> Fixes: 6700acc5f1fe ("devlink: collect flash notify params into a struct")
> Signed-off-by: Leon Romanovsky <[email protected]>
Yep, we should have used {} before. Are there any other misses where I used { 0 }.... Nope, I just double checked. Ok great!
Reviewed-by: Jacob Keller <[email protected]>
> [...]
> -----Original Message-----
> From: Jakub Kicinski <[email protected]>
> Sent: Monday, August 16, 2021 9:07 AM
> To: Leon Romanovsky <[email protected]>
> Cc: David S . Miller <[email protected]>; Guangbin Huang
> <[email protected]>; Keller, Jacob E <[email protected]>; Jiri
> Pirko <[email protected]>; [email protected]; [email protected];
> Salil Mehta <[email protected]>; Shannon Nelson
> <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> Mo <[email protected]>
> Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
>
> On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <[email protected]>
> > > >
> > > > The struct devlink itself is protected by internal lock and doesn't
> > > > need global lock during operation. That global lock is used to protect
> > > > addition/removal new devlink instances from the global list in use by
> > > > all devlink consumers in the system.
> > > >
> > > > The future conversion of linked list to be xarray will allow us to
> > > > actually delete that lock, but first we need to count all struct devlink
> > > > users.
> > >
> > > Not a problem with this set but to state the obvious the global devlink
> > > lock also protects from concurrent execution of all the ops which don't
> > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> > > this but I thought I'd comment on an off chance it helps.
> >
> > The end goal will be something like that:
> > 1. Delete devlink lock
> > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > get query in parallel.
> > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
>
> IIUC that'd mean setting eswitch mode would hold write lock on
> the dl instance. What locks does e.g. registering a dl port take
> then?
Also, I think we have some cases where we want to allow the driver to allocate new devlink objects in response to adding a port, but still want to block other global operations from running?
On Mon, Aug 16, 2021 at 09:07:00AM -0700, Jakub Kicinski wrote:
> On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <[email protected]>
> > > >
> > > > The struct devlink itself is protected by internal lock and doesn't
> > > > need global lock during operation. That global lock is used to protect
> > > > addition/removal new devlink instances from the global list in use by
> > > > all devlink consumers in the system.
> > > >
> > > > The future conversion of linked list to be xarray will allow us to
> > > > actually delete that lock, but first we need to count all struct devlink
> > > > users.
> > >
> > > Not a problem with this set but to state the obvious the global devlink
> > > lock also protects from concurrent execution of all the ops which don't
> > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> > > this but I thought I'd comment on an off chance it helps.
> >
> > The end goal will be something like that:
> > 1. Delete devlink lock
> > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > get query in parallel.
> > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
>
> IIUC that'd mean setting eswitch mode would hold write lock on
> the dl instance. What locks does e.g. registering a dl port take
> then?
A write lock, because we are adding the port to devlink->port_list:

int devlink_port_register(struct devlink *devlink,
			  struct devlink_port *devlink_port,
			  unsigned int port_index)
{
...
	list_add_tail(&devlink_port->list, &devlink->port_list);
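Under step 3 of the plan above, that registration path would presumably take the instance lock for writing. A minimal sketch only, assuming devlink->lock is converted to an rw_semaphore (it is a mutex today) and with the index checks and notifications of the real function omitted:

int devlink_port_register(struct devlink *devlink,
			  struct devlink_port *devlink_port,
			  unsigned int port_index)
{
	/* Write side: we mutate devlink->port_list. */
	down_write(&devlink->lock);

	devlink_port->devlink = devlink;
	devlink_port->index = port_index;
	list_add_tail(&devlink_port->list, &devlink->port_list);

	up_write(&devlink->lock);

	/* Read-only dump paths would use down_read(&devlink->lock). */
	return 0;
}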
On Mon, Aug 16, 2021 at 09:32:17PM +0000, Keller, Jacob E wrote:
>
>
> > -----Original Message-----
> > From: Jakub Kicinski <[email protected]>
> > Sent: Monday, August 16, 2021 9:07 AM
> > To: Leon Romanovsky <[email protected]>
> > Cc: David S . Miller <[email protected]>; Guangbin Huang
> > <[email protected]>; Keller, Jacob E <[email protected]>; Jiri
> > Pirko <[email protected]>; [email protected]; [email protected];
> > Salil Mehta <[email protected]>; Shannon Nelson
> > <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> > Mo <[email protected]>
> > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> >
> > On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > > From: Leon Romanovsky <[email protected]>
> > > > >
> > > > > The struct devlink itself is protected by internal lock and doesn't
> > > > > need global lock during operation. That global lock is used to protect
> > > > > addition/removal new devlink instances from the global list in use by
> > > > > all devlink consumers in the system.
> > > > >
> > > > > The future conversion of linked list to be xarray will allow us to
> > > > > actually delete that lock, but first we need to count all struct devlink
> > > > > users.
> > > >
> > > > Not a problem with this set but to state the obvious the global devlink
> > > > lock also protects from concurrent execution of all the ops which don't
> > > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely know
> > > > this but I thought I'd comment on an off chance it helps.
> > >
> > > The end goal will be something like that:
> > > 1. Delete devlink lock
> > > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > > get query in parallel.
> > > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
> >
> > IIUC that'd mean setting eswitch mode would hold write lock on
> > the dl instance. What locks does e.g. registering a dl port take
> > then?
>
> Also that I think we have some cases where we want to allow the driver to allocate new devlink objects in response to adding a port, but still want to block other global operations from running?
I don't see a flow where operations on devlink_A should block devlink_B.
Only in such flows would we need a global lock like the devlink_mutex we have now.
In all other flows, the write lock of the devlink instance will protect
against parallel execution.
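And if per-instance locking really is enough, step 4 of the plan becomes a property of the netlink family itself. An illustrative subset only - the real devlink_nl_family carries many more fields:

#include <net/genetlink.h>
#include <net/devlink.h>

static struct genl_family devlink_nl_family __ro_after_init = {
	.name		= DEVLINK_GENL_NAME,
	.version	= DEVLINK_GENL_VERSION,
	.maxattr	= DEVLINK_ATTR_MAX,
	/* Ops are no longer serialized by the genetlink mutex, so
	 * devlink's own per-instance locking must be sufficient.
	 */
	.parallel_ops	= true,
};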
Thanks
> -----Original Message-----
> From: Leon Romanovsky <[email protected]>
> Sent: Wednesday, August 18, 2021 1:12 AM
> To: Keller, Jacob E <[email protected]>
> Cc: Jakub Kicinski <[email protected]>; David S . Miller <[email protected]>;
> Guangbin Huang <[email protected]>; Jiri Pirko <[email protected]>;
> [email protected]; [email protected]; Salil Mehta
> <[email protected]>; Shannon Nelson <[email protected]>; Yisen
> Zhuang <[email protected]>; Yufeng Mo <[email protected]>
> Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
>
> On Mon, Aug 16, 2021 at 09:32:17PM +0000, Keller, Jacob E wrote:
> >
> >
> > > -----Original Message-----
> > > From: Jakub Kicinski <[email protected]>
> > > Sent: Monday, August 16, 2021 9:07 AM
> > > To: Leon Romanovsky <[email protected]>
> > > Cc: David S . Miller <[email protected]>; Guangbin Huang
> > > <[email protected]>; Keller, Jacob E <[email protected]>;
> Jiri
> > > Pirko <[email protected]>; [email protected];
> [email protected];
> > > Salil Mehta <[email protected]>; Shannon Nelson
> > > <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> > > Mo <[email protected]>
> > > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> > >
> > > On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > > > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > > > From: Leon Romanovsky <[email protected]>
> > > > > >
> > > > > > The struct devlink itself is protected by internal lock and doesn't
> > > > > > need global lock during operation. That global lock is used to protect
> > > > > > addition/removal new devlink instances from the global list in use by
> > > > > > all devlink consumers in the system.
> > > > > >
> > > > > > The future conversion of linked list to be xarray will allow us to
> > > > > > actually delete that lock, but first we need to count all struct devlink
> > > > > > users.
> > > > >
> > > > > Not a problem with this set but to state the obvious the global devlink
> > > > > lock also protects from concurrent execution of all the ops which don't
> > > > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely
> know
> > > > > this but I thought I'd comment on an off chance it helps.
> > > >
> > > > The end goal will be something like that:
> > > > 1. Delete devlink lock
> > > > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > > > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > > > get query in parallel.
> > > > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
> > >
> > > IIUC that'd mean setting eswitch mode would hold write lock on
> > > the dl instance. What locks does e.g. registering a dl port take
> > > then?
> >
> > Also that I think we have some cases where we want to allow the driver to
> allocate new devlink objects in response to adding a port, but still want to block
> other global operations from running?
>
> I don't see the flow where operations on devlink_A should block devlink_B.
> Only in such flows we will need global lock like we have now - devlink->lock.
> In all other flows, write lock of devlink instance will protect from
> parallel execution.
>
> Thanks
But how do we handle what is essentially recursion?
If we add a port on devlink A:
userspace sends PORT_ADD for devlink A
driver responds by creating a port
adding a port causes driver to add a region, or other devlink object
In the current design, if I understand correctly, we hold the global lock but *not* the instance lock. We can't hold the instance lock while adding a port without breaking a bunch of drivers that add many devlink objects in response to port creation, because they'll deadlock when going to add the sub-objects.
But if we don't hold the global lock, then in theory another userspace program could attempt to do something in between PORT_ADD starting and finishing, which might not be desirable. (Remember, we had to drop the instance lock because otherwise drivers get stuck when trying to add many sub-objects.)
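To make the concern concrete, a hypothetical driver flow (all the example_* names are made up): both devlink helpers below take devlink->lock internally in the current code, so if the PORT_ADD handler held the instance lock across the driver callback, the second acquisition would deadlock:

#include <linux/err.h>
#include <linux/sizes.h>
#include <linux/slab.h>
#include <net/devlink.h>

struct example_port {
	struct devlink_port dl_port;
	struct devlink_region *region;
};

static const struct devlink_region_ops example_region_ops = {
	.name = "example",
};

/* Hypothetical driver callback invoked from the PORT_ADD path. */
static int example_port_new(struct devlink *devlink, unsigned int index)
{
	struct example_port *port;
	int err;

	port = kzalloc(sizeof(*port), GFP_KERNEL);
	if (!port)
		return -ENOMEM;

	/* devlink_port_register() takes devlink->lock internally. */
	err = devlink_port_register(devlink, &port->dl_port, index);
	if (err)
		goto err_free;

	/* devlink_region_create() takes devlink->lock again: if the
	 * PORT_ADD handler already held the instance lock around this
	 * callback, the driver deadlocks right here.
	 */
	port->region = devlink_region_create(devlink, &example_region_ops,
					     1, SZ_64K);
	if (IS_ERR(port->region)) {
		err = PTR_ERR(port->region);
		goto err_unregister;
	}

	/* A real driver would stash "port" in its private state here. */
	return 0;

err_unregister:
	devlink_port_unregister(&port->dl_port);
err_free:
	kfree(port);
	return err;
}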
Thanks,
Jake
On Wed, Aug 18, 2021 at 05:50:11PM +0000, Keller, Jacob E wrote:
>
>
> > -----Original Message-----
> > From: Leon Romanovsky <[email protected]>
> > Sent: Wednesday, August 18, 2021 1:12 AM
> > To: Keller, Jacob E <[email protected]>
> > Cc: Jakub Kicinski <[email protected]>; David S . Miller <[email protected]>;
> > Guangbin Huang <[email protected]>; Jiri Pirko <[email protected]>;
> > [email protected]; [email protected]; Salil Mehta
> > <[email protected]>; Shannon Nelson <[email protected]>; Yisen
> > Zhuang <[email protected]>; Yufeng Mo <[email protected]>
> > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> >
> > On Mon, Aug 16, 2021 at 09:32:17PM +0000, Keller, Jacob E wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Jakub Kicinski <[email protected]>
> > > > Sent: Monday, August 16, 2021 9:07 AM
> > > > To: Leon Romanovsky <[email protected]>
> > > > Cc: David S . Miller <[email protected]>; Guangbin Huang
> > > > <[email protected]>; Keller, Jacob E <[email protected]>;
> > Jiri
> > > > Pirko <[email protected]>; [email protected];
> > [email protected];
> > > > Salil Mehta <[email protected]>; Shannon Nelson
> > > > <[email protected]>; Yisen Zhuang <[email protected]>; Yufeng
> > > > Mo <[email protected]>
> > > > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> > > >
> > > > On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > > > > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > > > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > > > > From: Leon Romanovsky <[email protected]>
> > > > > > >
> > > > > > > The struct devlink itself is protected by internal lock and doesn't
> > > > > > > need global lock during operation. That global lock is used to protect
> > > > > > > addition/removal new devlink instances from the global list in use by
> > > > > > > all devlink consumers in the system.
> > > > > > >
> > > > > > > The future conversion of linked list to be xarray will allow us to
> > > > > > > actually delete that lock, but first we need to count all struct devlink
> > > > > > > users.
> > > > > >
> > > > > > Not a problem with this set but to state the obvious the global devlink
> > > > > > lock also protects from concurrent execution of all the ops which don't
> > > > > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely
> > know
> > > > > > this but I thought I'd comment on an off chance it helps.
> > > > >
> > > > > The end goal will be something like that:
> > > > > 1. Delete devlink lock
> > > > > 2. Rely on xa_lock() while grabbing devlink instance (past devlink_try_get)
> > > > > 3. Convert devlink->lock to be read/write lock to make sure that we can run
> > > > > get query in parallel.
> > > > > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
> > > >
> > > > IIUC that'd mean setting eswitch mode would hold write lock on
> > > > the dl instance. What locks does e.g. registering a dl port take
> > > > then?
> > >
> > > Also that I think we have some cases where we want to allow the driver to
> > allocate new devlink objects in response to adding a port, but still want to block
> > other global operations from running?
> >
> > I don't see the flow where operations on devlink_A should block devlink_B.
> > Only in such flows we will need global lock like we have now - devlink->lock.
> > In all other flows, write lock of devlink instance will protect from
> > parallel execution.
> >
> > Thanks
>
>
> But how do we handle what is essentially recursion?
Let's wait until the implementation; I promise it will be covered :).
>
> If we add a port on the devlink A:
>
> userspace sends PORT_ADD for devlink A
> driver responds by creating a port
> adding a port causes driver to add a region, or other devlink object
>
> In the current design, if I understand correctly, we hold the global lock but *not* the instance lock. We can't hold the instance lock while adding port without breaking a bunch of drivers that add many devlink objects in response to port creation.. because they'll deadlock when going to add the sub objects.
>
> But if we don't hold the global lock, then in theory another userspace program could attempt to do something inbetween PORT_ADD starting and finishing which might not be desirable. (Remember, we had to drop the instance lock otherwise drivers get stuck when trying to add many subobjects)
You just surfaced my main issue with the current devlink
implementation - the purpose of devlink locking. Over the years, the devlink
code has lost the clear separation between user-space flows and kernel flows.
Thanks
>
> Thanks,
> Jake
> -----Original Message-----
> From: Leon Romanovsky <[email protected]>
> Sent: Friday, August 20, 2021 6:07 AM
> To: Keller, Jacob E <[email protected]>
> Cc: Jakub Kicinski <[email protected]>; David S . Miller <[email protected]>;
> Guangbin Huang <[email protected]>; Jiri Pirko <[email protected]>;
> [email protected]; [email protected]; Salil Mehta
> <[email protected]>; Shannon Nelson <[email protected]>; Yisen
> Zhuang <[email protected]>; Yufeng Mo <[email protected]>
> Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
>
> On Wed, Aug 18, 2021 at 05:50:11PM +0000, Keller, Jacob E wrote:
> >
> >
> > > -----Original Message-----
> > > From: Leon Romanovsky <[email protected]>
> > > Sent: Wednesday, August 18, 2021 1:12 AM
> > > To: Keller, Jacob E <[email protected]>
> > > Cc: Jakub Kicinski <[email protected]>; David S . Miller
> <[email protected]>;
> > > Guangbin Huang <[email protected]>; Jiri Pirko
> <[email protected]>;
> > > [email protected]; [email protected]; Salil Mehta
> > > <[email protected]>; Shannon Nelson <[email protected]>; Yisen
> > > Zhuang <[email protected]>; Yufeng Mo <[email protected]>
> > > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> > >
> > > On Mon, Aug 16, 2021 at 09:32:17PM +0000, Keller, Jacob E wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Jakub Kicinski <[email protected]>
> > > > > Sent: Monday, August 16, 2021 9:07 AM
> > > > > To: Leon Romanovsky <[email protected]>
> > > > > Cc: David S . Miller <[email protected]>; Guangbin Huang
> > > > > <[email protected]>; Keller, Jacob E
> <[email protected]>;
> > > Jiri
> > > > > Pirko <[email protected]>; [email protected];
> > > [email protected];
> > > > > Salil Mehta <[email protected]>; Shannon Nelson
> > > > > <[email protected]>; Yisen Zhuang <[email protected]>;
> Yufeng
> > > > > Mo <[email protected]>
> > > > > Subject: Re: [PATCH net-next 3/6] devlink: Count struct devlink consumers
> > > > >
> > > > > On Mon, 16 Aug 2021 18:53:45 +0300 Leon Romanovsky wrote:
> > > > > > On Mon, Aug 16, 2021 at 08:47:41AM -0700, Jakub Kicinski wrote:
> > > > > > > On Sat, 14 Aug 2021 12:57:28 +0300 Leon Romanovsky wrote:
> > > > > > > > From: Leon Romanovsky <[email protected]>
> > > > > > > >
> > > > > > > > The struct devlink itself is protected by internal lock and doesn't
> > > > > > > > need global lock during operation. That global lock is used to protect
> > > > > > > > addition/removal new devlink instances from the global list in use by
> > > > > > > > all devlink consumers in the system.
> > > > > > > >
> > > > > > > > The future conversion of linked list to be xarray will allow us to
> > > > > > > > actually delete that lock, but first we need to count all struct devlink
> > > > > > > > users.
> > > > > > >
> > > > > > > Not a problem with this set but to state the obvious the global devlink
> > > > > > > lock also protects from concurrent execution of all the ops which don't
> > > > > > > take the instance lock (DEVLINK_NL_FLAG_NO_LOCK). You most likely
> > > know
> > > > > > > this but I thought I'd comment on an off chance it helps.
> > > > > >
> > > > > > The end goal will be something like that:
> > > > > > 1. Delete devlink lock
> > > > > > 2. Rely on xa_lock() while grabbing devlink instance (past
> devlink_try_get)
> > > > > > 3. Convert devlink->lock to be read/write lock to make sure that we can
> run
> > > > > > get query in parallel.
> > > > > > 4. Open devlink netlink to parallel ops, ".parallel_ops = true".
> > > > >
> > > > > IIUC that'd mean setting eswitch mode would hold write lock on
> > > > > the dl instance. What locks does e.g. registering a dl port take
> > > > > then?
> > > >
> > > > Also that I think we have some cases where we want to allow the driver to
> > > allocate new devlink objects in response to adding a port, but still want to
> block
> > > other global operations from running?
> > >
> > > I don't see the flow where operations on devlink_A should block devlink_B.
> > > Only in such flows we will need global lock like we have now - devlink->lock.
> > > In all other flows, write lock of devlink instance will protect from
> > > parallel execution.
> > >
> > > Thanks
> >
> >
> > But how do we handle what is essentially recursion?
>
> Let's wait till implementation, I promise it will be covered :).
>
Sure. It's certainly easier to talk about a proposed implementation once we have it.
> >
> > If we add a port on the devlink A:
> >
> > userspace sends PORT_ADD for devlink A
> > driver responds by creating a port
> > adding a port causes driver to add a region, or other devlink object
> >
> > In the current design, if I understand correctly, we hold the global lock but
> *not* the instance lock. We can't hold the instance lock while adding port
> without breaking a bunch of drivers that add many devlink objects in response to
> port creation.. because they'll deadlock when going to add the sub objects.
> >
> > But if we don't hold the global lock, then in theory another userspace program
> could attempt to do something inbetween PORT_ADD starting and finishing
> which might not be desirable. (Remember, we had to drop the instance lock
> otherwise drivers get stuck when trying to add many subobjects)
>
> You just surfaced my main issue with the current devlink
> implementation - the purpose of devlink_lock. Over the years devlink
> code lost clear separation between user space flows and kernel flows.
>
> Thanks
>
Yep. It's definitely complex.
> >
> > Thanks,
> > Jake