This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
1.
The problem is that, recently we have run into a precedent when
a user intended to give a second argument to kfree_rcu() API but
forgot to do it in a code so a call became as a single argument
of kfree_rcu() API.
2.
Such mistyping can lead to hidden bags where sleeping is forbidden.
3.
_mightsleep() prefix gives much more information for which contexts
it can be used for.
Uladzislau Rezki (Sony) (13):
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
RDMA/rxe: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
ipvs: Rename kfree_rcu() to kfree_rcu_mightsleep()
rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
doc: Update whatisRCU.rst
rcu/kvfree: Eliminate k[v]free_rcu() single argument macro
Documentation/RCU/whatisRCU.rst | 6 ++--
drivers/block/drbd/drbd_nl.c | 6 ++--
drivers/block/drbd/drbd_receiver.c | 4 +--
drivers/block/drbd/drbd_state.c | 2 +-
drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
drivers/misc/vmw_vmci/vmci_context.c | 2 +-
drivers/misc/vmw_vmci/vmci_event.c | 2 +-
.../mellanox/mlx5/core/en/tc/int_port.c | 2 +-
.../mellanox/mlx5/core/en_accel/macsec.c | 4 +--
fs/ext4/super.c | 2 +-
include/linux/rcupdate.h | 28 ++++++-------------
kernel/rcu/rcuscale.c | 2 +-
kernel/trace/trace_osnoise.c | 2 +-
kernel/trace/trace_probe.c | 2 +-
lib/test_vmalloc.c | 2 +-
net/core/sysctl_net_core.c | 4 +--
net/netfilter/ipvs/ip_vs_est.c | 2 +-
17 files changed, 32 insertions(+), 42 deletions(-)
--
2.30.2
These two macroses will replace single-argument k[v]free_rcu() ones.
By adding an extra _mightsleep prefix we can avoid of situations when
someone intended to give a second argument but forgot to do it in a
code where sleeping is illegal.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
include/linux/rcupdate.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 822ff7b4bb1e..094321c17e48 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1013,6 +1013,9 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__, \
kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)
+#define kvfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
+#define kfree_rcu_mightsleep(ptr) kvfree_rcu_mightsleep(ptr)
+
#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
#define kvfree_rcu_arg_2(ptr, rhf) \
do { \
--
2.30.2
The kfree_rcu()'s single argument name is deprecated therefore
rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Cc: Theodore Ts'o <[email protected]>
Cc: Lukas Czerner <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
fs/ext4/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 260c1b3e3ef2..87aa23d047c9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2500,7 +2500,7 @@ static void ext4_apply_quota_options(struct fs_context *fc,
qname = rcu_replace_pointer(sbi->s_qf_names[i], qname,
lockdep_is_held(&sb->s_umount));
if (qname)
- kfree_rcu(qname);
+ kfree_rcu_mightsleep(qname);
}
}
--
2.30.2
The kvfree_rcu()'s single argument name is deprecated therefore
rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Cc: Eric Dumazet <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
net/core/sysctl_net_core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5b1ce656baa1..a28562d4e468 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -101,7 +101,7 @@ static int rps_sock_flow_sysctl(struct ctl_table *table, int write,
if (orig_sock_table) {
static_branch_dec(&rps_needed);
static_branch_dec(&rfs_needed);
- kvfree_rcu(orig_sock_table);
+ kvfree_rcu_mightsleep(orig_sock_table);
}
}
}
@@ -139,7 +139,7 @@ static int flow_limit_cpu_sysctl(struct ctl_table *table, int write,
lockdep_is_held(&flow_limit_update_mutex));
if (cur && !cpumask_test_cpu(i, mask)) {
RCU_INIT_POINTER(sd->flow_limit, NULL);
- kfree_rcu(cur);
+ kfree_rcu_mightsleep(cur);
} else if (!cur && cpumask_test_cpu(i, mask)) {
cur = kzalloc_node(len, GFP_KERNEL,
cpu_to_node(i));
--
2.30.2
The kfree_rcu()'s single argument name is deprecated therefore
rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Please check the RXE driver in a way that a single argument can
be used. Briefly looking at it and rcu_head should be embed to
free an obj over RCU-core. The context might be atomic.
Cc: Bob Pearson <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index f50620f5a0a1..e2fa061f19b3 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -276,7 +276,7 @@ int __rxe_cleanup(struct rxe_pool_elem *elem, bool sleepable)
pool->cleanup(elem);
if (pool->type == RXE_TYPE_MR)
- kfree_rcu(elem->obj);
+ kfree_rcu_mightsleep(elem->obj);
atomic_dec(&pool->num_elem);
--
2.30.2
The kvfree_rcu()'s single argument name is deprecated therefore
rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
lib/test_vmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index f90d2c27675b..6d8c5c0afd53 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -328,7 +328,7 @@ kvfree_rcu_1_arg_vmalloc_test(void)
return -1;
p->array[0] = 'a';
- kvfree_rcu(p);
+ kvfree_rcu_mightsleep(p);
}
return 0;
--
2.30.2
The kvfree_rcu()'s single argument name is deprecated therefore
rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Cc: Jens Axboe <[email protected]>
Cc: Philipp Reisner <[email protected]>
Cc: Lars Ellenberg <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 6 +++---
drivers/block/drbd/drbd_receiver.c | 4 ++--
drivers/block/drbd/drbd_state.c | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 60757ac31701..f49f2a5282e1 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1615,7 +1615,7 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info)
drbd_send_sync_param(peer_device);
}
- kvfree_rcu(old_disk_conf);
+ kvfree_rcu_mightsleep(old_disk_conf);
kfree(old_plan);
mod_timer(&device->request_timer, jiffies + HZ);
goto success;
@@ -2446,7 +2446,7 @@ int drbd_adm_net_opts(struct sk_buff *skb, struct genl_info *info)
mutex_unlock(&connection->resource->conf_update);
mutex_unlock(&connection->data.mutex);
- kvfree_rcu(old_net_conf);
+ kvfree_rcu_mightsleep(old_net_conf);
if (connection->cstate >= C_WF_REPORT_PARAMS) {
struct drbd_peer_device *peer_device;
@@ -2860,7 +2860,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
new_disk_conf->disk_size = (sector_t)rs.resize_size;
rcu_assign_pointer(device->ldev->disk_conf, new_disk_conf);
mutex_unlock(&device->resource->conf_update);
- kvfree_rcu(old_disk_conf);
+ kvfree_rcu_mightsleep(old_disk_conf);
new_disk_conf = NULL;
}
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 757f4692b5bd..e197b2a465d2 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3759,7 +3759,7 @@ static int receive_protocol(struct drbd_connection *connection, struct packet_in
drbd_info(connection, "peer data-integrity-alg: %s\n",
integrity_alg[0] ? integrity_alg : "(none)");
- kvfree_rcu(old_net_conf);
+ kvfree_rcu_mightsleep(old_net_conf);
return 0;
disconnect_rcu_unlock:
@@ -4127,7 +4127,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info
rcu_assign_pointer(device->ldev->disk_conf, new_disk_conf);
mutex_unlock(&connection->resource->conf_update);
- kvfree_rcu(old_disk_conf);
+ kvfree_rcu_mightsleep(old_disk_conf);
drbd_info(device, "Peer sets u_size to %lu sectors (old: %lu)\n",
(unsigned long)p_usize, (unsigned long)my_usize);
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 75d13ea0024f..2aeea295fa28 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -2071,7 +2071,7 @@ static int w_after_conn_state_ch(struct drbd_work *w, int unused)
conn_free_crypto(connection);
mutex_unlock(&connection->resource->conf_update);
- kvfree_rcu(old_conf);
+ kvfree_rcu_mightsleep(old_conf);
}
if (ns_max.susp_fen) {
--
2.30.2
The kvfree_rcu()'s single argument name is deprecated therefore
rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Cc: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
kernel/trace/trace_osnoise.c | 2 +-
kernel/trace/trace_probe.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 94c1b5eb1dc0..67392d872b67 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -160,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
if (!found)
return;
- kvfree_rcu(inst);
+ kvfree_rcu_mightsleep(inst);
}
/*
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 01ebabbbe8c9..32a7741dc731 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1170,7 +1170,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
return -ENOENT;
list_del_rcu(&link->list);
- kvfree_rcu(link);
+ kvfree_rcu_mightsleep(link);
if (list_empty(&tp->event->files))
trace_probe_clear_flag(tp, TP_FLAG_TRACE);
--
2.30.2
The kvfree_rcu()'s single argument name is deprecated therefore
rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
underline that it is for sleepable contexts.
Cc: Bryan Tan <[email protected]>
Cc: Vishnu Dasa <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
drivers/misc/vmw_vmci/vmci_context.c | 2 +-
drivers/misc/vmw_vmci/vmci_event.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c
index 172696abce31..f22b44827e92 100644
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -687,7 +687,7 @@ int vmci_ctx_remove_notification(u32 context_id, u32 remote_cid)
spin_unlock(&context->lock);
if (notifier)
- kvfree_rcu(notifier);
+ kvfree_rcu_mightsleep(notifier);
vmci_ctx_put(context);
diff --git a/drivers/misc/vmw_vmci/vmci_event.c b/drivers/misc/vmw_vmci/vmci_event.c
index 2100297c94ad..5d7ac07623c2 100644
--- a/drivers/misc/vmw_vmci/vmci_event.c
+++ b/drivers/misc/vmw_vmci/vmci_event.c
@@ -209,7 +209,7 @@ int vmci_event_unsubscribe(u32 sub_id)
if (!s)
return VMCI_ERROR_NOT_FOUND;
- kvfree_rcu(s);
+ kvfree_rcu_mightsleep(s);
return VMCI_SUCCESS;
}
--
2.30.2
On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>
> 1.
> The problem is that, recently we have run into a precedent when
> a user intended to give a second argument to kfree_rcu() API but
> forgot to do it in a code so a call became as a single argument
> of kfree_rcu() API.
>
> 2.
> Such mistyping can lead to hidden bags where sleeping is forbidden.
>
> 3.
> _mightsleep() prefix gives much more information for which contexts
> it can be used for.
Thank you!!!
I have queued these (aside from 10/13, which is being replaced by a
patch from Julian Anastasov) for further testing and review. And with
the usual wordsmithing.
If testing goes well, I might try to get 1/13 into the next merge window,
which would simplify things if maintainers want to take their patches
separately.
Thanx, Paul
> From: Uladzislau Rezki (Sony) <[email protected]>
> Sent: Wednesday, February 1, 2023 11:08 PM
> To: LKML <[email protected]>; RCU <[email protected]>; Paul E .
> McKenney <[email protected]>
> Cc: Uladzislau Rezki <[email protected]>; Oleksiy Avramchenko
> <[email protected]>; Jens Axboe <[email protected]>; Philipp
> Reisner <[email protected]>; Bryan Tan <[email protected]>;
> Steven Rostedt <[email protected]>; Eric Dumazet
> <[email protected]>; Bob Pearson <[email protected]>; Ariel
> Levkovich <[email protected]>; Theodore Ts'o <[email protected]>; Julian
> Anastasov <[email protected]>
> Subject: [PATCH 01/13] rcu/kvfree: Add kvfree_rcu_mightsleep() and
> kfree_rcu_mightsleep()
>
> These two macroses will replace single-argument k[v]free_rcu() ones.
> By adding an extra _mightsleep prefix we can avoid of situations when someone
s/prefix/suffix
> intended to give a second argument but forgot to do it in a code where sleeping
> is illegal.
>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ...
On Thu, Feb 02, 2023 at 07:54:31AM +0000, Zhuo, Qiuxu wrote:
> > From: Uladzislau Rezki (Sony) <[email protected]>
> > Sent: Wednesday, February 1, 2023 11:08 PM
> > To: LKML <[email protected]>; RCU <[email protected]>; Paul E .
> > McKenney <[email protected]>
> > Cc: Uladzislau Rezki <[email protected]>; Oleksiy Avramchenko
> > <[email protected]>; Jens Axboe <[email protected]>; Philipp
> > Reisner <[email protected]>; Bryan Tan <[email protected]>;
> > Steven Rostedt <[email protected]>; Eric Dumazet
> > <[email protected]>; Bob Pearson <[email protected]>; Ariel
> > Levkovich <[email protected]>; Theodore Ts'o <[email protected]>; Julian
> > Anastasov <[email protected]>
> > Subject: [PATCH 01/13] rcu/kvfree: Add kvfree_rcu_mightsleep() and
> > kfree_rcu_mightsleep()
> >
> > These two macroses will replace single-argument k[v]free_rcu() ones.
> > By adding an extra _mightsleep prefix we can avoid of situations when someone
>
> s/prefix/suffix
Good eyes, thank you!
Please see below for the version currently queued in the -rcu tree.
Thanx, Paul
> > intended to give a second argument but forgot to do it in a code where sleeping
> > is illegal.
> >
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > ...
------------------------------------------------------------------------
commit d6b54a44de23c83944d042b6f6fd9d19ccd3f1b8
Author: Uladzislau Rezki (Sony) <[email protected]>
Date: Wed Feb 1 16:08:07 2023 +0100
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
The kvfree_rcu() and kfree_rcu() APIs are hazardous in that if you forget
the second argument, it works, but might sleep. This sleeping can be a
correctness bug from atomic contexts, and even in non-atomic contexts
it might introduce unacceptable latencies. This commit therefore adds
kvfree_rcu_mightsleep() and kfree_rcu_mightsleep(), which will replace
the single-argument kvfree_rcu() and kfree_rcu(), respectively.
This commit enables a series of commits that switch from single-argument
kvfree_rcu() and kfree_rcu() to their _mightsleep() counterparts. Once
all of these commits land, the single-argument versions will be removed.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 822ff7b4bb1ed..094321c17e48a 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1013,6 +1013,9 @@ static inline notrace void rcu_read_unlock_sched_notrace(void)
#define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__, \
kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)
+#define kvfree_rcu_mightsleep(ptr) kvfree_rcu_arg_1(ptr)
+#define kfree_rcu_mightsleep(ptr) kvfree_rcu_mightsleep(ptr)
+
#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME
#define kvfree_rcu_arg_2(ptr, rhf) \
do { \
On Wed, Feb 01, 2023 at 11:12:11AM -0800, Paul E. McKenney wrote:
> On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> > it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
> >
> > 1.
> > The problem is that, recently we have run into a precedent when
> > a user intended to give a second argument to kfree_rcu() API but
> > forgot to do it in a code so a call became as a single argument
> > of kfree_rcu() API.
> >
> > 2.
> > Such mistyping can lead to hidden bags where sleeping is forbidden.
> >
> > 3.
> > _mightsleep() prefix gives much more information for which contexts
> > it can be used for.
>
> Thank you!!!
>
> I have queued these (aside from 10/13, which is being replaced by a
> patch from Julian Anastasov) for further testing and review. And with
> the usual wordsmithing.
>
Good. 10/13 will go with two arguments. So it is not needed. Just for
confirmation.
BTW, i see two complains from the robot due to 10/13 patch uses an old API
name.
>
> If testing goes well, I might try to get 1/13 into the next merge window,
> which would simplify things if maintainers want to take their patches
> separately.
>
Thanks!
--
Uladzislau Rezki
On Thu, Feb 02, 2023 at 04:54:15PM +0100, Uladzislau Rezki wrote:
> On Wed, Feb 01, 2023 at 11:12:11AM -0800, Paul E. McKenney wrote:
> > On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > > This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> > > it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
> > >
> > > 1.
> > > The problem is that, recently we have run into a precedent when
> > > a user intended to give a second argument to kfree_rcu() API but
> > > forgot to do it in a code so a call became as a single argument
> > > of kfree_rcu() API.
> > >
> > > 2.
> > > Such mistyping can lead to hidden bags where sleeping is forbidden.
> > >
> > > 3.
> > > _mightsleep() prefix gives much more information for which contexts
> > > it can be used for.
> >
> > Thank you!!!
> >
> > I have queued these (aside from 10/13, which is being replaced by a
> > patch from Julian Anastasov) for further testing and review. And with
> > the usual wordsmithing.
> >
> Good. 10/13 will go with two arguments. So it is not needed. Just for
> confirmation.
Got it, thank you!
> BTW, i see two complains from the robot due to 10/13 patch uses an old API
> name.
Thank you for the heads up! I will be careful not to expose the last in
the series to -next before they get something in. Unless they take too
long. ;-)
Thanx, Paul
> > If testing goes well, I might try to get 1/13 into the next merge window,
> > which would simplify things if maintainers want to take their patches
> > separately.
> >
> Thanks!
>
> --
> Uladzislau Rezki
On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>
> 1.
> The problem is that, recently we have run into a precedent when
> a user intended to give a second argument to kfree_rcu() API but
> forgot to do it in a code so a call became as a single argument
> of kfree_rcu() API.
>
> 2.
> Such mistyping can lead to hidden bags where sleeping is forbidden.
>
> 3.
> _mightsleep() prefix gives much more information for which contexts
> it can be used for.
>
> Uladzislau Rezki (Sony) (13):
> rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
> drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> RDMA/rxe: Rename kfree_rcu() to kfree_rcu_mightsleep()
> net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
> ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
> ipvs: Rename kfree_rcu() to kfree_rcu_mightsleep()
> rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
> doc: Update whatisRCU.rst
> rcu/kvfree: Eliminate k[v]free_rcu() single argument macro
Hi,
Not sure if you guys noticed but on latest rcu/dev:
net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2 arguments, but only 1 given
kfree_rcu(td);
^
net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in this function); did you mean ‘kfree_skb’?
kfree_rcu(td);
^~~~~~~~~
kfree_skb
net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is reported only once for each function it appears in
Thanks.
> From: Frederic Weisbecker <[email protected]>
> Sent: Thursday, February 23, 2023 8:45 PM
> To: Uladzislau Rezki (Sony) <[email protected]>
> Cc: LKML <[email protected]>; RCU <[email protected]>; Paul
> E . McKenney <[email protected]>; Oleksiy Avramchenko
> <[email protected]>; Jens Axboe <[email protected]>; Philipp
> Reisner <[email protected]>; Bryan Tan <[email protected]>;
> Steven Rostedt <[email protected]>; Eric Dumazet
> <[email protected]>; Bob Pearson <[email protected]>; Ariel
> Levkovich <[email protected]>; Theodore Ts'o <[email protected]>; Julian
> Anastasov <[email protected]>
> Subject: Re: [PATCH 00/13] Rename k[v]free_rcu() single argument to
> k[v]free_rcu_mightsleep()
>
> On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > This small series is based on Paul's "dev" branch. Head is
> > 6002817348a1c610dc1b1c01ff81654cdec12be4
> > it renames a single argument of k[v]free_rcu() to its new
> k[v]free_rcu_mightsleep() name.
> >
> > 1.
> > The problem is that, recently we have run into a precedent when a user
> > intended to give a second argument to kfree_rcu() API but forgot to do
> > it in a code so a call became as a single argument of kfree_rcu() API.
> >
> > 2.
> > Such mistyping can lead to hidden bags where sleeping is forbidden.
> >
> > 3.
> > _mightsleep() prefix gives much more information for which contexts it
> > can be used for.
> >
> > Uladzislau Rezki (Sony) (13):
> > rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
> > drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > RDMA/rxe: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > ipvs: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > doc: Update whatisRCU.rst
> > rcu/kvfree: Eliminate k[v]free_rcu() single argument macro
>
> Hi,
>
> Not sure if you guys noticed but on latest rcu/dev:
>
> net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> arguments, but only 1 given
> kfree_rcu(td);
> ^
> net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> this function); did you mean ‘kfree_skb’?
> kfree_rcu(td);
> ^~~~~~~~~
> kfree_skb
> net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> reported only once for each function it appears in
>
Hi Frederic Weisbecker,
I encountered the same build error as yours.
Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
@Paul E . McKenney, please correct me if my understanding is wrong. ????
https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
-Qiuxu
On 2/1/23 8:08 AM, Uladzislau Rezki (Sony) wrote:
> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>
> 1.
> The problem is that, recently we have run into a precedent when
> a user intended to give a second argument to kfree_rcu() API but
> forgot to do it in a code so a call became as a single argument
> of kfree_rcu() API.
>
> 2.
> Such mistyping can lead to hidden bags where sleeping is forbidden.
>
> 3.
> _mightsleep() prefix gives much more information for which contexts
> it can be used for.
This patchset seems weird to me. We have a LOT of calls that might
sleep, yet we don't suffix them all with _mightsleep(). Why is this
any different? Why isn't this just a might_sleep() call in the
actual helper, which will suffice for checkers and catch it at
runtime as well.
--
Jens Axboe
On Thu, Feb 23, 2023 at 02:29:42PM +0000, Zhuo, Qiuxu wrote:
> > From: Frederic Weisbecker <[email protected]>
> > Sent: Thursday, February 23, 2023 8:45 PM
> > To: Uladzislau Rezki (Sony) <[email protected]>
> > Cc: LKML <[email protected]>; RCU <[email protected]>; Paul
> > E . McKenney <[email protected]>; Oleksiy Avramchenko
> > <[email protected]>; Jens Axboe <[email protected]>; Philipp
> > Reisner <[email protected]>; Bryan Tan <[email protected]>;
> > Steven Rostedt <[email protected]>; Eric Dumazet
> > <[email protected]>; Bob Pearson <[email protected]>; Ariel
> > Levkovich <[email protected]>; Theodore Ts'o <[email protected]>; Julian
> > Anastasov <[email protected]>
> > Subject: Re: [PATCH 00/13] Rename k[v]free_rcu() single argument to
> > k[v]free_rcu_mightsleep()
> >
> > On Wed, Feb 01, 2023 at 04:08:06PM +0100, Uladzislau Rezki (Sony) wrote:
> > > This small series is based on Paul's "dev" branch. Head is
> > > 6002817348a1c610dc1b1c01ff81654cdec12be4
> > > it renames a single argument of k[v]free_rcu() to its new
> > k[v]free_rcu_mightsleep() name.
> > >
> > > 1.
> > > The problem is that, recently we have run into a precedent when a user
> > > intended to give a second argument to kfree_rcu() API but forgot to do
> > > it in a code so a call became as a single argument of kfree_rcu() API.
> > >
> > > 2.
> > > Such mistyping can lead to hidden bags where sleeping is forbidden.
> > >
> > > 3.
> > > _mightsleep() prefix gives much more information for which contexts it
> > > can be used for.
> > >
> > > Uladzislau Rezki (Sony) (13):
> > > rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
> > > drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > > misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > > tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > > lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > > net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
> > > RDMA/rxe: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > > net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > > ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > > ipvs: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > > rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
> > > doc: Update whatisRCU.rst
> > > rcu/kvfree: Eliminate k[v]free_rcu() single argument macro
> >
> > Hi,
> >
> > Not sure if you guys noticed but on latest rcu/dev:
> >
> > net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> > net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> > arguments, but only 1 given
> > kfree_rcu(td);
> > ^
> > net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> > this function); did you mean ‘kfree_skb’?
> > kfree_rcu(td);
> > ^~~~~~~~~
> > kfree_skb
> > net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> > reported only once for each function it appears in
>
> Hi Frederic Weisbecker,
>
> I encountered the same build error as yours.
> Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
> @Paul E . McKenney, please correct me if my understanding is wrong. ????
>
> https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
Pablo and Julian, how are things coming with that patch?
Thanx, Paul
Hello,
On Thu, 23 Feb 2023, Paul E. McKenney wrote:
> > > Not sure if you guys noticed but on latest rcu/dev:
> > >
> > > net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> > > net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> > > arguments, but only 1 given
> > > kfree_rcu(td);
> > > ^
> > > net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> > > this function); did you mean ‘kfree_skb’?
> > > kfree_rcu(td);
> > > ^~~~~~~~~
> > > kfree_skb
> > > net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> > > reported only once for each function it appears in
> >
> > Hi Frederic Weisbecker,
> >
> > I encountered the same build error as yours.
> > Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
> > @Paul E . McKenney, please correct me if my understanding is wrong. ????
> >
> > https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
>
> Pablo and Julian, how are things coming with that patch?
Fix is already in net and net-next tree
Regards
--
Julian Anastasov <[email protected]>
On Thu, Feb 23, 2023 at 06:21:46PM +0200, Julian Anastasov wrote:
>
> Hello,
>
> On Thu, 23 Feb 2023, Paul E. McKenney wrote:
>
> > > > Not sure if you guys noticed but on latest rcu/dev:
> > > >
> > > > net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> > > > net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> > > > arguments, but only 1 given
> > > > kfree_rcu(td);
> > > > ^
> > > > net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> > > > this function); did you mean ‘kfree_skb’?
> > > > kfree_rcu(td);
> > > > ^~~~~~~~~
> > > > kfree_skb
> > > > net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> > > > reported only once for each function it appears in
> > >
> > > Hi Frederic Weisbecker,
> > >
> > > I encountered the same build error as yours.
> > > Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
> > > @Paul E . McKenney, please correct me if my understanding is wrong. ????
> > >
> > > https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
> >
> > Pablo and Julian, how are things coming with that patch?
>
> Fix is already in net and net-next tree
Very good, thank you! Is this going into this merge window or is it
expected to wait for v6.4?
Thanx, Paul
On Thu, Feb 23, 2023 at 09:14:32AM -0800, Paul E. McKenney wrote:
> On Thu, Feb 23, 2023 at 06:21:46PM +0200, Julian Anastasov wrote:
> >
> > Hello,
> >
> > On Thu, 23 Feb 2023, Paul E. McKenney wrote:
> >
> > > > > Not sure if you guys noticed but on latest rcu/dev:
> > > > >
> > > > > net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> > > > > net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> > > > > arguments, but only 1 given
> > > > > kfree_rcu(td);
> > > > > ^
> > > > > net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> > > > > this function); did you mean ‘kfree_skb’?
> > > > > kfree_rcu(td);
> > > > > ^~~~~~~~~
> > > > > kfree_skb
> > > > > net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> > > > > reported only once for each function it appears in
> > > >
> > > > Hi Frederic Weisbecker,
> > > >
> > > > I encountered the same build error as yours.
> > > > Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
> > > > @Paul E . McKenney, please correct me if my understanding is wrong. ????
> > > >
> > > > https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
> > >
> > > Pablo and Julian, how are things coming with that patch?
> >
> > Fix is already in net and net-next tree
>
> Very good, thank you! Is this going into this merge window or is it
> expected to wait for v6.4?
this merge window. Thanx, Paul
On Thu, Feb 23, 2023 at 06:36:26PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Feb 23, 2023 at 09:14:32AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 23, 2023 at 06:21:46PM +0200, Julian Anastasov wrote:
> > >
> > > Hello,
> > >
> > > On Thu, 23 Feb 2023, Paul E. McKenney wrote:
> > >
> > > > > > Not sure if you guys noticed but on latest rcu/dev:
> > > > > >
> > > > > > net/netfilter/ipvs/ip_vs_est.c: In function ‘ip_vs_stop_estimator’:
> > > > > > net/netfilter/ipvs/ip_vs_est.c:552:15: error: macro "kfree_rcu" requires 2
> > > > > > arguments, but only 1 given
> > > > > > kfree_rcu(td);
> > > > > > ^
> > > > > > net/netfilter/ipvs/ip_vs_est.c:552:3: error: ‘kfree_rcu’ undeclared (first use in
> > > > > > this function); did you mean ‘kfree_skb’?
> > > > > > kfree_rcu(td);
> > > > > > ^~~~~~~~~
> > > > > > kfree_skb
> > > > > > net/netfilter/ipvs/ip_vs_est.c:552:3: note: each undeclared identifier is
> > > > > > reported only once for each function it appears in
> > > > >
> > > > > Hi Frederic Weisbecker,
> > > > >
> > > > > I encountered the same build error as yours.
> > > > > Per the discussion link below, the fix for this build error by Uladzislau Rezki will be picked up by some other maintainer's branch?
> > > > > @Paul E . McKenney, please correct me if my understanding is wrong. ????
> > > > >
> > > > > https://lore.kernel.org/rcu/Y9qc+lgR1CgdszKs@salvia/
> > > >
> > > > Pablo and Julian, how are things coming with that patch?
> > >
> > > Fix is already in net and net-next tree
> >
> > Very good, thank you! Is this going into this merge window or is it
> > expected to wait for v6.4?
>
> this merge window. Thanx, Paul
Thank you, and looking forward to it!
Thanx, Paul
On Thu, Feb 23, 2023 at 07:57:13AM -0700, Jens Axboe wrote:
> On 2/1/23 8:08 AM, Uladzislau Rezki (Sony) wrote:
> > This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> > it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
> >
> > 1.
> > The problem is that, recently we have run into a precedent when
> > a user intended to give a second argument to kfree_rcu() API but
> > forgot to do it in a code so a call became as a single argument
> > of kfree_rcu() API.
> >
> > 2.
> > Such mistyping can lead to hidden bags where sleeping is forbidden.
> >
> > 3.
> > _mightsleep() prefix gives much more information for which contexts
> > it can be used for.
>
> This patchset seems weird to me. We have a LOT of calls that might
> sleep, yet we don't suffix them all with _mightsleep(). Why is this
> any different? Why isn't this just a might_sleep() call in the
> actual helper, which will suffice for checkers and catch it at
> runtime as well.
Fair enough, and the situation that this patchset is addressing is also a
bit unusual. This change was requested by Eric Dumazet due to a situation
where someone forgot the optional second argument to kfree_rcu(). Now,
you are right that this would be caught if invoked from a non-sleepable
context, but there are also cases where sleeping is legal, but where the
occasional wait for an RCU grace period would be a problem. The checkers
cannot easily catch this sort of thing, and hence the change in name.
Hey, the combined one/two-argument form seemed like a good idea at
the time! ;-)
Thanx, Paul
On 2/23/23 11:31 AM, Paul E. McKenney wrote:
> On Thu, Feb 23, 2023 at 07:57:13AM -0700, Jens Axboe wrote:
>> On 2/1/23 8:08 AM, Uladzislau Rezki (Sony) wrote:
>>> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
>>> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>>>
>>> 1.
>>> The problem is that, recently we have run into a precedent when
>>> a user intended to give a second argument to kfree_rcu() API but
>>> forgot to do it in a code so a call became as a single argument
>>> of kfree_rcu() API.
>>>
>>> 2.
>>> Such mistyping can lead to hidden bags where sleeping is forbidden.
>>>
>>> 3.
>>> _mightsleep() prefix gives much more information for which contexts
>>> it can be used for.
>>
>> This patchset seems weird to me. We have a LOT of calls that might
>> sleep, yet we don't suffix them all with _mightsleep(). Why is this
>> any different? Why isn't this just a might_sleep() call in the
>> actual helper, which will suffice for checkers and catch it at
>> runtime as well.
>
> Fair enough, and the situation that this patchset is addressing is also a
> bit unusual. This change was requested by Eric Dumazet due to a situation
> where someone forgot the optional second argument to kfree_rcu(). Now,
> you are right that this would be caught if invoked from a non-sleepable
> context, but there are also cases where sleeping is legal, but where the
> occasional wait for an RCU grace period would be a problem. The checkers
> cannot easily catch this sort of thing, and hence the change in name.
>
> Hey, the combined one/two-argument form seemed like a good idea at
> the time! ;-)
Hah, not sure what you were smoking back then!
--
Jens Axboe
On Thu, Feb 23, 2023 at 12:36:57PM -0700, Jens Axboe wrote:
> On 2/23/23 11:31 AM, Paul E. McKenney wrote:
> > On Thu, Feb 23, 2023 at 07:57:13AM -0700, Jens Axboe wrote:
> >> On 2/1/23 8:08 AM, Uladzislau Rezki (Sony) wrote:
> >>> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> >>> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
> >>>
> >>> 1.
> >>> The problem is that, recently we have run into a precedent when
> >>> a user intended to give a second argument to kfree_rcu() API but
> >>> forgot to do it in a code so a call became as a single argument
> >>> of kfree_rcu() API.
> >>>
> >>> 2.
> >>> Such mistyping can lead to hidden bags where sleeping is forbidden.
> >>>
> >>> 3.
> >>> _mightsleep() prefix gives much more information for which contexts
> >>> it can be used for.
> >>
> >> This patchset seems weird to me. We have a LOT of calls that might
> >> sleep, yet we don't suffix them all with _mightsleep(). Why is this
> >> any different? Why isn't this just a might_sleep() call in the
> >> actual helper, which will suffice for checkers and catch it at
> >> runtime as well.
> >
> > Fair enough, and the situation that this patchset is addressing is also a
> > bit unusual. This change was requested by Eric Dumazet due to a situation
> > where someone forgot the optional second argument to kfree_rcu(). Now,
> > you are right that this would be caught if invoked from a non-sleepable
> > context, but there are also cases where sleeping is legal, but where the
> > occasional wait for an RCU grace period would be a problem. The checkers
> > cannot easily catch this sort of thing, and hence the change in name.
> >
> > Hey, the combined one/two-argument form seemed like a good idea at
> > the time! ;-)
>
> Hah, not sure what you were smoking back then!
If I remember, would you like me to send you some? ;-)
Thanx, Paul
On 2/23/23 12:47?PM, Paul E. McKenney wrote:
>>> Hey, the combined one/two-argument form seemed like a good idea at
>>> the time! ;-)
>>
>> Hah, not sure what you were smoking back then!
>
> If I remember, would you like me to send you some? ;-)
Definitely, might improve my reviews :-)
--
Jens Axboe
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Jens Axboe <[email protected]>
> Cc: Philipp Reisner <[email protected]>
> Cc: Lars Ellenberg <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> drivers/block/drbd/drbd_nl.c | 6 +++---
> drivers/block/drbd/drbd_receiver.c | 4 ++--
> drivers/block/drbd/drbd_state.c | 2 +-
> 3 files changed, 6 insertions(+), 6 deletions(-)
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Bryan Tan <[email protected]>
> Cc: Vishnu Dasa <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
> The kfree_rcu()'s single argument name is deprecated therefore
> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Theodore Ts'o <[email protected]>
> Cc: Lukas Czerner <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Steven Rostedt (VMware) <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzisslau Rezki
On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
> The kfree_rcu()'s single argument name is deprecated therefore
> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Please check the RXE driver in a way that a single argument can
> be used. Briefly looking at it and rcu_head should be embed to
> free an obj over RCU-core. The context might be atomic.
>
> Cc: Bob Pearson <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
On Wed, Feb 01, 2023 at 04:08:12PM +0100, Uladzislau Rezki (Sony) wrote:
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Eric Dumazet <[email protected]>
> Cc: David S. Miller <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> net/core/sysctl_net_core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Eric Dumazet <[email protected]>
> Cc: David S. Miller <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> net/core/sysctl_net_core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
Could you please add you reviwed-by or Acked-by tags so we can bring
our series with renaming for the next merge window?
Thanks!
--
Uladzislau Rezki
> On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > The kfree_rcu()'s single argument name is deprecated therefore
> > rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> > underline that it is for sleepable contexts.
> >
> > Please check the RXE driver in a way that a single argument can
> > be used. Briefly looking at it and rcu_head should be embed to
> > free an obj over RCU-core. The context might be atomic.
> >
> > Cc: Bob Pearson <[email protected]>
> > Cc: Jason Gunthorpe <[email protected]>
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > ---
> > drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> Could you please add you reviwed-by or Acked-by tags so we can bring
> our series with renaming for the next merge window?
>
> Thanks!
>
__rxe_cleanup() can be called in two contexts, sleepable and not.
Therefore usage of a single argument of the kvfree_rcu() is not correct
here.
Could you please fix and check your driver? If my above statement
is not correct, please provide Acked-by or Reviwed-by tags to the
path that is in question.
Otherwise please add an rcu_head in your data to free objects over
kvfree_rcu() using double argument API.
Could you please support?
--
Uladzislau Rezki
> On Feb 1, 2023, at 7:08 AM, Uladzislau Rezki (Sony) <[email protected]> wrote:
>
> The kvfree_rcu()'s single argument name is deprecated therefore
> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> underline that it is for sleepable contexts.
>
> Cc: Bryan Tan <[email protected]>
> Cc: Vishnu Dasa <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: Vishnu Dasa <[email protected]>
> ---
> drivers/misc/vmw_vmci/vmci_context.c | 2 +-
> drivers/misc/vmw_vmci/vmci_event.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c
> index 172696abce31..f22b44827e92 100644
> --- a/drivers/misc/vmw_vmci/vmci_context.c
> +++ b/drivers/misc/vmw_vmci/vmci_context.c
> @@ -687,7 +687,7 @@ int vmci_ctx_remove_notification(u32 context_id, u32 remote_cid)
> spin_unlock(&context->lock);
>
> if (notifier)
> - kvfree_rcu(notifier);
> + kvfree_rcu_mightsleep(notifier);
>
> vmci_ctx_put(context);
>
> diff --git a/drivers/misc/vmw_vmci/vmci_event.c b/drivers/misc/vmw_vmci/vmci_event.c
> index 2100297c94ad..5d7ac07623c2 100644
> --- a/drivers/misc/vmw_vmci/vmci_event.c
> +++ b/drivers/misc/vmw_vmci/vmci_event.c
> @@ -209,7 +209,7 @@ int vmci_event_unsubscribe(u32 sub_id)
> if (!s)
> return VMCI_ERROR_NOT_FOUND;
>
> - kvfree_rcu(s);
> + kvfree_rcu_mightsleep(s);
>
> return VMCI_SUCCESS;
> }
> --
> 2.30.2
>
On Thu, Mar 09, 2023 at 03:13:08PM +0100, Uladzislau Rezki wrote:
> > On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > > The kfree_rcu()'s single argument name is deprecated therefore
> > > rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> > > underline that it is for sleepable contexts.
> > >
> > > Please check the RXE driver in a way that a single argument can
> > > be used. Briefly looking at it and rcu_head should be embed to
> > > free an obj over RCU-core. The context might be atomic.
> > >
> > > Cc: Bob Pearson <[email protected]>
> > > Cc: Jason Gunthorpe <[email protected]>
> > > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > > ---
> > > drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > Could you please add you reviwed-by or Acked-by tags so we can bring
> > our series with renaming for the next merge window?
> >
> > Thanks!
> >
> __rxe_cleanup() can be called in two contexts, sleepable and not.
> Therefore usage of a single argument of the kvfree_rcu() is not correct
> here.
>
> Could you please fix and check your driver? If my above statement
> is not correct, please provide Acked-by or Reviwed-by tags to the
> path that is in question.
>
> Otherwise please add an rcu_head in your data to free objects over
> kvfree_rcu() using double argument API.
>
> Could you please support?
Also this one needs renaming? It came in because of the commit in 6.3-rc1:
72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
It could be squashed into this patch itself since it is infiniband related.
Paul noticed that this breaks dropping the old API on -next, so it is
blocking the renaming.
---8<-----------------------
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index b10aa1580a64..ae3a100e18fb 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
return -EINVAL;
rxe_cleanup(mr);
- kfree_rcu(mr);
+ kfree_rcu_mightsleep(mr);
return 0;
}
On 3/9/23 18:55, Joel Fernandes wrote:
> On Thu, Mar 09, 2023 at 03:13:08PM +0100, Uladzislau Rezki wrote:
>>> On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
>>>> The kfree_rcu()'s single argument name is deprecated therefore
>>>> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
>>>> underline that it is for sleepable contexts.
>>>>
>>>> Please check the RXE driver in a way that a single argument can
>>>> be used. Briefly looking at it and rcu_head should be embed to
>>>> free an obj over RCU-core. The context might be atomic.
>>>>
>>>> Cc: Bob Pearson <[email protected]>
>>>> Cc: Jason Gunthorpe <[email protected]>
>>>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>>>> ---
>>>> drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>> Could you please add you reviwed-by or Acked-by tags so we can bring
>>> our series with renaming for the next merge window?
>>>
>>> Thanks!
>>>
>> __rxe_cleanup() can be called in two contexts, sleepable and not.
>> Therefore usage of a single argument of the kvfree_rcu() is not correct
>> here.
>>
>> Could you please fix and check your driver? If my above statement
>> is not correct, please provide Acked-by or Reviwed-by tags to the
>> path that is in question.
>>
>> Otherwise please add an rcu_head in your data to free objects over
>> kvfree_rcu() using double argument API.
>>
>> Could you please support?
>
> Also this one needs renaming? It came in because of the commit in 6.3-rc1:
> 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
>
> It could be squashed into this patch itself since it is infiniband related.
>
> Paul noticed that this breaks dropping the old API on -next, so it is
> blocking the renaming.
>
> ---8<-----------------------
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index b10aa1580a64..ae3a100e18fb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> return -EINVAL;
>
> rxe_cleanup(mr);
> - kfree_rcu(mr);
> + kfree_rcu_mightsleep(mr);
> return 0;
> }
>
I just got back from a 1 week vacation and missed all this.
The "RDMA/rxe: Remove rxe_alloc()" patch just moved the memory allocation for MR (verbs) objects outside
of the rxe_pool code since it only applied to MRs and not the other verbs objects (AH, QP, CQ, ...).
That code has to handle a unique situation for AH objects which can be created or destroyed by connection
manager code in atomic context while all the other ones including MRs are always created/destroyed in process
context. All objects other than MR's are created/destroyed in the rdma-core code (drivers/infiniband/core).
The rxe driver keeps xarray's of pointers to the various objects which are protected by rcu locking and so
it made sense to use kfree_rcu to delete the object with a delay. In the MR case ..._mightsleep seems harmless
and should not be an issue.
However on reflection, all the references to the MR objects are ref counted and they have been dropped before
reaching the kfree and so there really never was a good reason to use kfree_rcu in the first place. So
a better solution would be to replace kfree_rcu with kfree. There is a timeout in completion_done() that
triggers a WARN_ON() and this is only seen if the driver is broken for some reason but that is equivalent to
getting a seg fault so no reason to further delay the kfree.
Reviewed-by: Bob Pearson <[email protected]>
On Thu, Mar 9, 2023 at 10:13 PM Uladzislau Rezki <[email protected]> wrote:
>
> > On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > > The kfree_rcu()'s single argument name is deprecated therefore
> > > rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> > > underline that it is for sleepable contexts.
> > >
> > > Please check the RXE driver in a way that a single argument can
> > > be used. Briefly looking at it and rcu_head should be embed to
> > > free an obj over RCU-core. The context might be atomic.
> > >
> > > Cc: Bob Pearson <[email protected]>
> > > Cc: Jason Gunthorpe <[email protected]>
> > > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Thanks.
Acked-by: Zhu Yanjun <[email protected]>
Zhu Yanjun
> > > ---
> > > drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > Could you please add you reviwed-by or Acked-by tags so we can bring
> > our series with renaming for the next merge window?
> >
> > Thanks!
> >
> __rxe_cleanup() can be called in two contexts, sleepable and not.
> Therefore usage of a single argument of the kvfree_rcu() is not correct
> here.
>
> Could you please fix and check your driver? If my above statement
> is not correct, please provide Acked-by or Reviwed-by tags to the
> path that is in question.
>
> Otherwise please add an rcu_head in your data to free objects over
> kvfree_rcu() using double argument API.
>
> Could you please support?
>
> --
> Uladzislau Rezki
On Mon, Mar 13, 2023 at 02:43:43PM -0500, Bob Pearson wrote:
> On 3/9/23 18:55, Joel Fernandes wrote:
> > On Thu, Mar 09, 2023 at 03:13:08PM +0100, Uladzislau Rezki wrote:
> >>> On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
> >>>> The kfree_rcu()'s single argument name is deprecated therefore
> >>>> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
> >>>> underline that it is for sleepable contexts.
> >>>>
> >>>> Please check the RXE driver in a way that a single argument can
> >>>> be used. Briefly looking at it and rcu_head should be embed to
> >>>> free an obj over RCU-core. The context might be atomic.
> >>>>
> >>>> Cc: Bob Pearson <[email protected]>
> >>>> Cc: Jason Gunthorpe <[email protected]>
> >>>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> >>>> ---
> >>>> drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>> Could you please add you reviwed-by or Acked-by tags so we can bring
> >>> our series with renaming for the next merge window?
> >>>
> >>> Thanks!
> >>>
> >> __rxe_cleanup() can be called in two contexts, sleepable and not.
> >> Therefore usage of a single argument of the kvfree_rcu() is not correct
> >> here.
> >>
> >> Could you please fix and check your driver? If my above statement
> >> is not correct, please provide Acked-by or Reviwed-by tags to the
> >> path that is in question.
> >>
> >> Otherwise please add an rcu_head in your data to free objects over
> >> kvfree_rcu() using double argument API.
> >>
> >> Could you please support?
> >
> > Also this one needs renaming? It came in because of the commit in 6.3-rc1:
> > 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
> >
> > It could be squashed into this patch itself since it is infiniband related.
> >
> > Paul noticed that this breaks dropping the old API on -next, so it is
> > blocking the renaming.
> >
> > ---8<-----------------------
> >
> > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> > index b10aa1580a64..ae3a100e18fb 100644
> > --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> > @@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> > return -EINVAL;
> >
> > rxe_cleanup(mr);
> > - kfree_rcu(mr);
> > + kfree_rcu_mightsleep(mr);
> > return 0;
> > }
> >
> I just got back from a 1 week vacation and missed all this.
>
> The "RDMA/rxe: Remove rxe_alloc()" patch just moved the memory allocation
> for MR (verbs) objects outside of the rxe_pool code since it only applied
> to MRs and not the other verbs objects (AH, QP, CQ, ...). That code has to
> handle a unique situation for AH objects which can be created or destroyed
> by connection manager code in atomic context while all the other ones
> including MRs are always created/destroyed in process context. All objects
> other than MR's are created/destroyed in the rdma-core code
> (drivers/infiniband/core).
>
> The rxe driver keeps xarray's of pointers to the various objects which are
> protected by rcu locking and so it made sense to use kfree_rcu to delete
> the object with a delay. In the MR case ..._mightsleep seems harmless and
> should not be an issue.
>
> However on reflection, all the references to the MR objects are ref counted
> and they have been dropped before reaching the kfree and so there really
> never was a good reason to use kfree_rcu in the first place. So a better
> solution would be to replace kfree_rcu with kfree. There is a timeout in
> completion_done() that triggers a WARN_ON() and this is only seen if the
> driver is broken for some reason but that is equivalent to getting a seg
> fault so no reason to further delay the kfree.
>
> Reviewed-by: Bob Pearson <[email protected]>
Thanks, I am planning to send the following patch for 6.4 consideration,
please let me know if you disagree. Still testing it.
----8<---
From: Joel Fernandes (Google) <[email protected]>
Subject: [PATCH] RDMA/rxe: Rename kfree_rcu() to kvfree_rcu_mightsleep()
The k[v]free_rcu() macro's single-argument form is deprecated.
Therefore switch to the new k[v]free_rcu_mightsleep() variant. The goal
is to avoid accidental use of the single-argument forms, which can
introduce functionality bugs in atomic contexts and latency bugs in
non-atomic contexts.
There is no functionality change with this patch.
Link: https://lore.kernel.org/rcu/[email protected]
Acked-by: Zhu Yanjun <[email protected]>
Reviewed-by: Bob Pearson <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Fixes: 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
drivers/infiniband/sw/rxe/rxe_mr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index b10aa1580a64..ae3a100e18fb 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
return -EINVAL;
rxe_cleanup(mr);
- kfree_rcu(mr);
+ kfree_rcu_mightsleep(mr);
return 0;
}
--
2.40.0.rc1.284.g88254d51c5-goog
On 3/15/23 06:50, Joel Fernandes wrote:
> On Mon, Mar 13, 2023 at 02:43:43PM -0500, Bob Pearson wrote:
>> On 3/9/23 18:55, Joel Fernandes wrote:
>>> On Thu, Mar 09, 2023 at 03:13:08PM +0100, Uladzislau Rezki wrote:
>>>>> On Wed, Feb 01, 2023 at 04:08:13PM +0100, Uladzislau Rezki (Sony) wrote:
>>>>>> The kfree_rcu()'s single argument name is deprecated therefore
>>>>>> rename it to kfree_rcu_mightsleep() variant. The goal is explicitly
>>>>>> underline that it is for sleepable contexts.
>>>>>>
>>>>>> Please check the RXE driver in a way that a single argument can
>>>>>> be used. Briefly looking at it and rcu_head should be embed to
>>>>>> free an obj over RCU-core. The context might be atomic.
>>>>>>
>>>>>> Cc: Bob Pearson <[email protected]>
>>>>>> Cc: Jason Gunthorpe <[email protected]>
>>>>>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>>>>>> ---
>>>>>> drivers/infiniband/sw/rxe/rxe_pool.c | 2 +-
>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>> Could you please add you reviwed-by or Acked-by tags so we can bring
>>>>> our series with renaming for the next merge window?
>>>>>
>>>>> Thanks!
>>>>>
>>>> __rxe_cleanup() can be called in two contexts, sleepable and not.
>>>> Therefore usage of a single argument of the kvfree_rcu() is not correct
>>>> here.
>>>>
>>>> Could you please fix and check your driver? If my above statement
>>>> is not correct, please provide Acked-by or Reviwed-by tags to the
>>>> path that is in question.
>>>>
>>>> Otherwise please add an rcu_head in your data to free objects over
>>>> kvfree_rcu() using double argument API.
>>>>
>>>> Could you please support?
>>>
>>> Also this one needs renaming? It came in because of the commit in 6.3-rc1:
>>> 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
>>>
>>> It could be squashed into this patch itself since it is infiniband related.
>>>
>>> Paul noticed that this breaks dropping the old API on -next, so it is
>>> blocking the renaming.
>>>
>>> ---8<-----------------------
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>>> index b10aa1580a64..ae3a100e18fb 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>>> @@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
>>> return -EINVAL;
>>>
>>> rxe_cleanup(mr);
>>> - kfree_rcu(mr);
>>> + kfree_rcu_mightsleep(mr);
>>> return 0;
>>> }
>>>
>> I just got back from a 1 week vacation and missed all this.
>>
>> The "RDMA/rxe: Remove rxe_alloc()" patch just moved the memory allocation
>> for MR (verbs) objects outside of the rxe_pool code since it only applied
>> to MRs and not the other verbs objects (AH, QP, CQ, ...). That code has to
>> handle a unique situation for AH objects which can be created or destroyed
>> by connection manager code in atomic context while all the other ones
>> including MRs are always created/destroyed in process context. All objects
>> other than MR's are created/destroyed in the rdma-core code
>> (drivers/infiniband/core).
>>
>> The rxe driver keeps xarray's of pointers to the various objects which are
>> protected by rcu locking and so it made sense to use kfree_rcu to delete
>> the object with a delay. In the MR case ..._mightsleep seems harmless and
>> should not be an issue.
>>
>> However on reflection, all the references to the MR objects are ref counted
>> and they have been dropped before reaching the kfree and so there really
>> never was a good reason to use kfree_rcu in the first place. So a better
>> solution would be to replace kfree_rcu with kfree. There is a timeout in
>> completion_done() that triggers a WARN_ON() and this is only seen if the
>> driver is broken for some reason but that is equivalent to getting a seg
>> fault so no reason to further delay the kfree.
>>
>> Reviewed-by: Bob Pearson <[email protected]>
>
> Thanks, I am planning to send the following patch for 6.4 consideration,
> please let me know if you disagree. Still testing it.
>
> ----8<---
>
> From: Joel Fernandes (Google) <[email protected]>
> Subject: [PATCH] RDMA/rxe: Rename kfree_rcu() to kvfree_rcu_mightsleep()
>
> The k[v]free_rcu() macro's single-argument form is deprecated.
> Therefore switch to the new k[v]free_rcu_mightsleep() variant. The goal
> is to avoid accidental use of the single-argument forms, which can
> introduce functionality bugs in atomic contexts and latency bugs in
> non-atomic contexts.
>
> There is no functionality change with this patch.
>
> Link: https://lore.kernel.org/rcu/[email protected]
> Acked-by: Zhu Yanjun <[email protected]>
> Reviewed-by: Bob Pearson <[email protected]>
> Reviewed-by: Paul E. McKenney <[email protected]>
> Fixes: 72a03627443d ("RDMA/rxe: Remove rxe_alloc()")
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> drivers/infiniband/sw/rxe/rxe_mr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index b10aa1580a64..ae3a100e18fb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -731,7 +731,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
> return -EINVAL;
>
> rxe_cleanup(mr);
> - kfree_rcu(mr);
> + kfree_rcu_mightsleep(mr);
> return 0;
> }
>
I would prefer just
- kfree_rcu(mr);
+ kfree(mr);
but either one will work.
Bob
On Wed, 1 Feb 2023 16:08:06 +0100
"Uladzislau Rezki (Sony)" <[email protected]> wrote:
> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>
> 1.
> The problem is that, recently we have run into a precedent when
> a user intended to give a second argument to kfree_rcu() API but
> forgot to do it in a code so a call became as a single argument
> of kfree_rcu() API.
>
> 2.
> Such mistyping can lead to hidden bags where sleeping is forbidden.
>
> 3.
> _mightsleep() prefix gives much more information for which contexts
> it can be used for.
My honest opinion is that I hate that name "kvfree_rcu_mightsleep()" ;-)
As I honestly don't know why it might sleep.
I didn't care about the name before, but now that it's touching code I
maintain I do care ;-)
Why not call it:
kvfree_rcu_synchronize()
?
As that is much more descriptive of what it does. Especially since these
ugly names are popping up in my code because kvfree_rcu() replaced a
rcu_synchronize() in the first place.
-- Steve
On 3/15/23 1:14?PM, Steven Rostedt wrote:
> On Wed, 1 Feb 2023 16:08:06 +0100
> "Uladzislau Rezki (Sony)" <[email protected]> wrote:
>
>> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
>> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
>>
>> 1.
>> The problem is that, recently we have run into a precedent when
>> a user intended to give a second argument to kfree_rcu() API but
>> forgot to do it in a code so a call became as a single argument
>> of kfree_rcu() API.
>>
>> 2.
>> Such mistyping can lead to hidden bags where sleeping is forbidden.
>>
>> 3.
>> _mightsleep() prefix gives much more information for which contexts
>> it can be used for.
>
> My honest opinion is that I hate that name "kvfree_rcu_mightsleep()" ;-)
>
> As I honestly don't know why it might sleep.
>
> I didn't care about the name before, but now that it's touching code I
> maintain I do care ;-)
>
> Why not call it:
>
> kvfree_rcu_synchronize()
>
> ?
>
> As that is much more descriptive of what it does. Especially since these
> ugly names are popping up in my code because kvfree_rcu() replaced a
> rcu_synchronize() in the first place.
This was my main complaint too, kvfree_rcu_mightsleep() is an absolutely
horrible name for an API... But nobody seemed to care about that!
I like the _synchronize() suggestion, as it matches other RCU naming.
--
Jens Axboe
On Wed, Mar 15, 2023 at 01:16:20PM -0600, Jens Axboe wrote:
> On 3/15/23 1:14?PM, Steven Rostedt wrote:
> > On Wed, 1 Feb 2023 16:08:06 +0100
> > "Uladzislau Rezki (Sony)" <[email protected]> wrote:
> >
> >> This small series is based on Paul's "dev" branch. Head is 6002817348a1c610dc1b1c01ff81654cdec12be4
> >> it renames a single argument of k[v]free_rcu() to its new k[v]free_rcu_mightsleep() name.
> >>
> >> 1.
> >> The problem is that, recently we have run into a precedent when
> >> a user intended to give a second argument to kfree_rcu() API but
> >> forgot to do it in a code so a call became as a single argument
> >> of kfree_rcu() API.
> >>
> >> 2.
> >> Such mistyping can lead to hidden bags where sleeping is forbidden.
> >>
> >> 3.
> >> _mightsleep() prefix gives much more information for which contexts
> >> it can be used for.
> >
> > My honest opinion is that I hate that name "kvfree_rcu_mightsleep()" ;-)
> >
> > As I honestly don't know why it might sleep.
> >
> > I didn't care about the name before, but now that it's touching code I
> > maintain I do care ;-)
> >
> > Why not call it:
> >
> > kvfree_rcu_synchronize()
> >
> > ?
> >
> > As that is much more descriptive of what it does. Especially since these
> > ugly names are popping up in my code because kvfree_rcu() replaced a
> > rcu_synchronize() in the first place.
>
> This was my main complaint too, kvfree_rcu_mightsleep() is an absolutely
> horrible name for an API... But nobody seemed to care about that!
>
> I like the _synchronize() suggestion, as it matches other RCU naming.
>
This is basically about what it does. If you renamed it to "_synchronize()"
in reality it would not mean that it always a synchronous call, most of the
time it is not whereas the name would point that it is.
--
Uladzislau Rezki
On Wed, 15 Mar 2023 20:25:10 +0100
Uladzislau Rezki <[email protected]> wrote:
> > This was my main complaint too, kvfree_rcu_mightsleep() is an absolutely
> > horrible name for an API... But nobody seemed to care about that!
> >
> > I like the _synchronize() suggestion, as it matches other RCU naming.
> >
> This is basically about what it does. If you renamed it to "_synchronize()"
> in reality it would not mean that it always a synchronous call, most of the
> time it is not whereas the name would point that it is.
No, just comment it.
I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
getting ridiculous.
Still, I will replace that code back to a kfree() and rcu_synchonize() than
to let that other name get in.
-- Steve
Hey Steve,
On Wed, Mar 15, 2023 at 3:35 PM Steven Rostedt <[email protected]> wrote:
>
> On Wed, 15 Mar 2023 20:25:10 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > This was my main complaint too, kvfree_rcu_mightsleep() is an absolutely
> > > horrible name for an API... But nobody seemed to care about that!
> > >
> > > I like the _synchronize() suggestion, as it matches other RCU naming.
> > >
> > This is basically about what it does. If you renamed it to "_synchronize()"
> > in reality it would not mean that it always a synchronous call, most of the
> > time it is not whereas the name would point that it is.
>
> No, just comment it.
>
> I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
> getting ridiculous.
No, synchronize() is incorrect. The code really can sleep for other
reasons like memory allocation. It is not that simple of an
implementation as one may imagine. mightsleep is really the correct
wording IMHO.
> Still, I will replace that code back to a kfree() and rcu_synchonize() than
> to let that other name get in.
I think it is too late for that for now, we already have conversions
going into the other subsystems, that means we'll have to redo all
that over again (even if it sounded like a good idea, which it is
not).
I would rather you just did: "#define kvfree_rcu_tracing
#kvfree_rcu_mightsleep", or something like that, if it is really a
problem. ;-)
Also you are really the first person I know of who has a problem with that name.
- Joel
On Wed, 15 Mar 2023 15:57:02 -0400
Joel Fernandes <[email protected]> wrote:
> > I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
> > getting ridiculous.
>
> No, synchronize() is incorrect. The code really can sleep for other
> reasons like memory allocation. It is not that simple of an
> implementation as one may imagine. mightsleep is really the correct
> wording IMHO.
>
> > Still, I will replace that code back to a kfree() and rcu_synchonize() than
> > to let that other name get in.
>
> I think it is too late for that for now, we already have conversions
> going into the other subsystems, that means we'll have to redo all
> that over again (even if it sounded like a good idea, which it is
> not).
>
> I would rather you just did: "#define kvfree_rcu_tracing
> #kvfree_rcu_mightsleep", or something like that, if it is really a
> problem. ;-)
>
> Also you are really the first person I know of who has a problem with that name.
I guess you didn't read Jens's reply.
The main issue I have with this, is that "might_sleep" is just an
implementation issue. It has *nothing* to do with what the call is about.
It is only about freeing something with RCU. It has nothing to do with
sleeping. I don't use it because it might sleep. I use it to free something.
If you don't like kvfree_rcu_synchronization() then call it
kvfree_rcu_headless() and note that currently it can sleep. Because in
the future, if we come up with an implementation where we it doesn't sleep,
then we don't need to go and rename all the users in the future.
See where I have the problem with the name "might_sleep"?
-- Steve
On Wed, Mar 15, 2023 at 04:28:40PM -0400, Steven Rostedt wrote:
> On Wed, 15 Mar 2023 15:57:02 -0400
> Joel Fernandes <[email protected]> wrote:
>
> > > I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
> > > getting ridiculous.
> >
> > No, synchronize() is incorrect. The code really can sleep for other
> > reasons like memory allocation. It is not that simple of an
> > implementation as one may imagine. mightsleep is really the correct
> > wording IMHO.
> >
> > > Still, I will replace that code back to a kfree() and rcu_synchonize() than
> > > to let that other name get in.
> >
> > I think it is too late for that for now, we already have conversions
> > going into the other subsystems, that means we'll have to redo all
> > that over again (even if it sounded like a good idea, which it is
> > not).
> >
> > I would rather you just did: "#define kvfree_rcu_tracing
> > #kvfree_rcu_mightsleep", or something like that, if it is really a
> > problem. ;-)
> >
> > Also you are really the first person I know of who has a problem with that name.
>
> I guess you didn't read Jens's reply.
>
> The main issue I have with this, is that "might_sleep" is just an
> implementation issue. It has *nothing* to do with what the call is about.
> It is only about freeing something with RCU. It has nothing to do with
> sleeping. I don't use it because it might sleep. I use it to free something.
>
> If you don't like kvfree_rcu_synchronization() then call it
> kvfree_rcu_headless() and note that currently it can sleep. Because in
> the future, if we come up with an implementation where we it doesn't sleep,
> then we don't need to go and rename all the users in the future.
>
> See where I have the problem with the name "might_sleep"?
>
In that sense there is no need in renaming it. The current name of
single argument is kvfree_rcu(ptr). It is documented that it can sleep.
According to its name it is clear that it is headless since there
is no a second argument.
--
Uladzislau Rezki
On Wed, Mar 15, 2023 at 10:07:22PM +0100, Uladzislau Rezki wrote:
> On Wed, Mar 15, 2023 at 04:28:40PM -0400, Steven Rostedt wrote:
> > On Wed, 15 Mar 2023 15:57:02 -0400
> > Joel Fernandes <[email protected]> wrote:
> >
> > > > I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
> > > > getting ridiculous.
> > >
> > > No, synchronize() is incorrect. The code really can sleep for other
> > > reasons like memory allocation. It is not that simple of an
> > > implementation as one may imagine. mightsleep is really the correct
> > > wording IMHO.
> > >
> > > > Still, I will replace that code back to a kfree() and rcu_synchonize() than
> > > > to let that other name get in.
> > >
> > > I think it is too late for that for now, we already have conversions
> > > going into the other subsystems, that means we'll have to redo all
> > > that over again (even if it sounded like a good idea, which it is
> > > not).
> > >
> > > I would rather you just did: "#define kvfree_rcu_tracing
> > > #kvfree_rcu_mightsleep", or something like that, if it is really a
> > > problem. ;-)
> > >
> > > Also you are really the first person I know of who has a problem with that name.
> >
> > I guess you didn't read Jens's reply.
> >
> > The main issue I have with this, is that "might_sleep" is just an
> > implementation issue. It has *nothing* to do with what the call is about.
> > It is only about freeing something with RCU. It has nothing to do with
> > sleeping. I don't use it because it might sleep. I use it to free something.
> >
> > If you don't like kvfree_rcu_synchronization() then call it
> > kvfree_rcu_headless() and note that currently it can sleep. Because in
> > the future, if we come up with an implementation where we it doesn't sleep,
> > then we don't need to go and rename all the users in the future.
> >
> > See where I have the problem with the name "might_sleep"?
> >
> In that sense there is no need in renaming it. The current name of
> single argument is kvfree_rcu(ptr). It is documented that it can sleep.
>
> According to its name it is clear that it is headless since there
> is no a second argument.
>
Forgot to add. I agree with you that currently it can sleep but it
does not mean that a future stays the same, thus there might be an
extra need in renaming again.
--
Uladzislau Rezki
Hey Steve,
On Wed, Mar 15, 2023 at 4:28 PM Steven Rostedt <[email protected]> wrote:
>
> On Wed, 15 Mar 2023 15:57:02 -0400
> Joel Fernandes <[email protected]> wrote:
>
> > > I was going to suggest "kvfree_rcu_might_synchronize()" but that's just
> > > getting ridiculous.
> >
> > No, synchronize() is incorrect. The code really can sleep for other
> > reasons like memory allocation. It is not that simple of an
> > implementation as one may imagine. mightsleep is really the correct
> > wording IMHO.
> >
> > > Still, I will replace that code back to a kfree() and rcu_synchonize() than
> > > to let that other name get in.
> >
> > I think it is too late for that for now, we already have conversions
> > going into the other subsystems, that means we'll have to redo all
> > that over again (even if it sounded like a good idea, which it is
> > not).
> >
> > I would rather you just did: "#define kvfree_rcu_tracing
> > #kvfree_rcu_mightsleep", or something like that, if it is really a
> > problem. ;-)
> >
> > Also you are really the first person I know of who has a problem with that name.
>
> I guess you didn't read Jens's reply.
Apologies, I am trying to keep up with email but this week is crazy.
> The main issue I have with this, is that "might_sleep" is just an
> implementation issue. It has *nothing* to do with what the call is about.
> It is only about freeing something with RCU. It has nothing to do with
> sleeping. I don't use it because it might sleep. I use it to free something.
>
> If you don't like kvfree_rcu_synchronization() then call it
> kvfree_rcu_headless() and note that currently it can sleep. Because in
> the future, if we come up with an implementation where we it doesn't sleep,
> then we don't need to go and rename all the users in the future.
>
> See where I have the problem with the name "might_sleep"?
I am doubtful there may be a future where it does not sleep. Why?
Because you need an rcu_head *somewhere*. Unlike with debubojects,
which involves a lock-free per-CPU pool and a locked global pool, and
has the liberty to shutdown if it runs out of objects -- in RCU code
it doesn't have that liberty and it has to just keep working. The
kfree_rcu code does have pools of rcu_head as well, but that is not
thought to be enough to prevent OOM when memory needs to be given
back. AFAIK -- the synchronize_rcu() in there is a last resort and
undesirable (supposed to happen only when running out of
objects/memory).
Also "mightsleep" means just that -- *might*. That covers the fact
that sleeping may not happen ;-).
This is just my opinion and I will defer to Uladzislau, Paul and you
on how to proceed. Another option is "cansleep" which has the same
number of characters as headless. I don't believe expecting users to
read comments is practical, since we did already have comments and
there was a bug in the usage that triggered this whole series.
thanks,
- Joel
On Wed, 15 Mar 2023 18:08:19 -0400
Joel Fernandes <[email protected]> wrote:
> I am doubtful there may be a future where it does not sleep. Why?
> Because you need an rcu_head *somewhere*. Unlike with debubojects,
> which involves a lock-free per-CPU pool and a locked global pool, and
> has the liberty to shutdown if it runs out of objects -- in RCU code
> it doesn't have that liberty and it has to just keep working. The
> kfree_rcu code does have pools of rcu_head as well, but that is not
> thought to be enough to prevent OOM when memory needs to be given
> back. AFAIK -- the synchronize_rcu() in there is a last resort and
> undesirable (supposed to happen only when running out of
> objects/memory).
And everything you said above is still implementation, and the user of
kvfree_rcu() doesn't care.
The only thing different about the two cases is that one is headless.
>
> Also "mightsleep" means just that -- *might*. That covers the fact
> that sleeping may not happen ;-).
Yes, and even though you are doubtful of it not ever having a non-sleep
implementation, there is still a chance that there might be something
someday.
>
> This is just my opinion and I will defer to Uladzislau, Paul and you
> on how to proceed. Another option is "cansleep" which has the same
> number of characters as headless. I don't believe expecting users to
> read comments is practical, since we did already have comments and
> there was a bug in the usage that triggered this whole series.
The point of "headless" is that is the rational for this version of
kvfree_rcu(). It doesn't have a head. That's an API name that users care
about.
Why not call it kvfree_rcu_alloc() ? It allocates right?
We have might_sleep() in lots of places. In fact, the default is things
might sleep. We don't need to call it out. That's what the might_sleep()
call is for. Usually it's the non sleep version that is special.
We could call the normal kvfree_rcu() "kvfree_rcu_inatomic()" ;-)
But I guess that would be a bigger change.
-- Steve
On Thu, 9 Mar 2023 14:45:21 +0100
Uladzislau Rezki <[email protected]> wrote:
> > The kvfree_rcu()'s single argument name is deprecated therefore
> > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> > underline that it is for sleepable contexts.
> >
> > Cc: Steven Rostedt (VMware) <[email protected]>
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> >
> Could you please add you reviwed-by or Acked-by tags so we can bring
> our series with renaming for the next merge window?
I don't know. Perhaps we should just apply this patch and not worry about
sleeping and whatnot.
-- Steve
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 04f0fdae19a1..5de945a8f61d 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
struct osnoise_instance {
struct list_head list;
struct trace_array *tr;
+ struct rcu_head rcu;
};
static struct list_head osnoise_instances;
@@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
if (!found)
return;
- kvfree_rcu(inst);
+ kvfree_rcu(inst, rcu);
}
/*
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 20d0c4a97633..ef5fafb40c76 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
return -ENOENT;
list_del_rcu(&link->list);
- kvfree_rcu(link);
+ kvfree_rcu(link, rcu);
if (list_empty(&tp->event->files))
trace_probe_clear_flag(tp, TP_FLAG_TRACE);
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index ef8ed3b65d05..e6037752dcf0 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -256,6 +256,7 @@ struct trace_probe {
struct event_file_link {
struct trace_event_file *file;
struct list_head list;
+ struct rcu_head rcu;
};
static inline bool trace_probe_test_flag(struct trace_probe *tp,
On 3/15/23 4:36 PM, Steven Rostedt wrote:
> On Thu, 9 Mar 2023 14:45:21 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
>>> The kvfree_rcu()'s single argument name is deprecated therefore
>>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
>>> underline that it is for sleepable contexts.
>>>
>>> Cc: Steven Rostedt (VMware) <[email protected]>
>>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>>>
>> Could you please add you reviwed-by or Acked-by tags so we can bring
>> our series with renaming for the next merge window?
>
> I don't know. Perhaps we should just apply this patch and not worry about
> sleeping and whatnot.
That's a cop out, just removing the one case you care about. Fact is
the naming is awful, and the 1/2 argument thing is making it worse.
If a big change is warranted, why not do it right and ACTUALLY
get it right?
--
Jens Axboe
On Wed, Mar 15, 2023 at 05:19:18PM -0600, Jens Axboe wrote:
> On 3/15/23 4:36 PM, Steven Rostedt wrote:
> > On Thu, 9 Mar 2023 14:45:21 +0100
> > Uladzislau Rezki <[email protected]> wrote:
> >
> >>> The kvfree_rcu()'s single argument name is deprecated therefore
> >>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> >>> underline that it is for sleepable contexts.
> >>>
> >>> Cc: Steven Rostedt (VMware) <[email protected]>
> >>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> >>>
> >> Could you please add you reviwed-by or Acked-by tags so we can bring
> >> our series with renaming for the next merge window?
> >
> > I don't know. Perhaps we should just apply this patch and not worry about
> > sleeping and whatnot.
That does work, and I am guessing that the size increase is not a big
problem for you there.
> That's a cop out, just removing the one case you care about. Fact is
> the naming is awful, and the 1/2 argument thing is making it worse.
> If a big change is warranted, why not do it right and ACTUALLY
> get it right?
You both do realize that the kvfree_rcu_mightsleep() definition is
already in mainline, right?
Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire
community eventually decides to name it--can do any of the following:
1. Put the pointer into an already allocated array of pointers.
2. Allocate a new array of pointers, have the allocation succeed
without sleeping, then put the pointer into an already allocated
array of pointers.
3. Allocate a new array of pointers, have the allocation succeed
after sleeping, then put the pointer into an already allocated
array of pointers.
4. Attempt to allocate a new array of pointers, have the allocation
fail (presumably after sleeping), then invoke synchronize_rcu()
directly.
Too much fun! ;-)
Thanx, Paul
On Wed, Mar 15, 2023 at 03:34:48PM -0400, Steven Rostedt wrote:
>
> Still, I will replace that code back to a kfree() and rcu_synchonize() than
> to let that other name get in.
That will have a performance hit relaive to kfree_rcu_mightsleep().
If that's OK with you, sure, you can do that.
Personally, I don't have a lot of problem with that name, which is why
I ack'ed the change for ext4.
- Ted
On Wed, Mar 15, 2023 at 06:08:19PM -0400, Joel Fernandes wrote:
>
> I am doubtful there may be a future where it does not sleep. Why?
> Because you need an rcu_head *somewhere*.
I think the real problem was that this won't sleep:
kfree_rcu(ptr, rhf);
While this *could* sleep:
kfree_rcu(ptr);
So the the original sin was to try to make the same mistake that C++
did --- which is to think that it's good to have functions that have
the same name but different function signatures, and in some cases,
different semantic meanings because they have different implementations.
Personally, this is why I refuse to use C++ for any of my personal
projects --- this kind of "magic" looks good, but it's a great way to
potentially shoot yourself (or worse, your users) in the foot.
So separating out the two-argument kfree_rcu() from the one-argument
kfree_rcu(), by renaming the latter to something else is IMHO, a
Really F***** Good Idea. So while, sure, kfree_rcu_mightsleep() might
be a little awkward, the name documents the potential landmind
involved with using that function, that's a good thing. Because do
you really think users will always conscientiously check the
documentation and/or the implementation before using the interface? :-)
If you hate that name, one other possibility is to try to use the
two-argument form kfree_rcu() and arrange to *have* a rcu_head in the
structure. That's going to be better from a performance perspective,
and thus kinder to the end user than using rcu_synchronize().
Cheers,
- Ted
Hey Steve,
On Wed, Mar 15, 2023 at 6:26 PM Steven Rostedt <[email protected]> wrote:
>
[...]
> > Also "mightsleep" means just that -- *might*. That covers the fact
> > that sleeping may not happen ;-).
>
> Yes, and even though you are doubtful of it not ever having a non-sleep
> implementation, there is still a chance that there might be something
> someday.
Perhaps if it never sleeps, then we would introduce back the
single-arg kvfree_rcu() and delete the kvfree_rcu_mightsleep()` at
that point, since it would not serve any purpose.
> > This is just my opinion and I will defer to Uladzislau, Paul and you
> > on how to proceed. Another option is "cansleep" which has the same
> > number of characters as headless. I don't believe expecting users to
> > read comments is practical, since we did already have comments and
> > there was a bug in the usage that triggered this whole series.
>
> The point of "headless" is that is the rational for this version of
> kvfree_rcu(). It doesn't have a head. That's an API name that users care
> about.
>
> Why not call it kvfree_rcu_alloc() ? It allocates right?
Sure, but one can say now that allocating is an implementation detail? ;-)
Also, it may sound strange to have 'free' and 'alloc' in the same name.
> We have might_sleep() in lots of places. In fact, the default is things
> might sleep. We don't need to call it out. That's what the might_sleep()
> call is for. Usually it's the non sleep version that is special.
>
> We could call the normal kvfree_rcu() "kvfree_rcu_inatomic()" ;-)
Heh, I actually like 'inatomic' alot ;-)
> But I guess that would be a bigger change.
>
True.
thanks,
- Joel
On Wed, 15 Mar 2023 21:25:16 -0400
"Theodore Ts'o" <[email protected]> wrote:
> On Wed, Mar 15, 2023 at 06:08:19PM -0400, Joel Fernandes wrote:
> >
> > I am doubtful there may be a future where it does not sleep. Why?
> > Because you need an rcu_head *somewhere*.
>
> I think the real problem was that this won't sleep:
>
> kfree_rcu(ptr, rhf);
>
> While this *could* sleep:
>
> kfree_rcu(ptr);
>
> So the the original sin was to try to make the same mistake that C++
> did --- which is to think that it's good to have functions that have
> the same name but different function signatures, and in some cases,
> different semantic meanings because they have different implementations.
>
> Personally, this is why I refuse to use C++ for any of my personal
> projects --- this kind of "magic" looks good, but it's a great way to
> potentially shoot yourself (or worse, your users) in the foot.
>
> So separating out the two-argument kfree_rcu() from the one-argument
> kfree_rcu(), by renaming the latter to something else is IMHO, a
> Really F***** Good Idea. So while, sure, kfree_rcu_mightsleep() might
> be a little awkward, the name documents the potential landmind
> involved with using that function, that's a good thing. Because do
> you really think users will always conscientiously check the
> documentation and/or the implementation before using the interface? :-)
I agree with everything you said above, and feel that having the same name
with two different semantics was not a good way to go. Not to mention, I
avoid C++ for basically the same reasons (plus others).
>
> If you hate that name, one other possibility is to try to use the
> two-argument form kfree_rcu() and arrange to *have* a rcu_head in the
> structure. That's going to be better from a performance perspective,
> and thus kinder to the end user than using rcu_synchronize().
Which is the what I last suggested doing.
https://lore.kernel.org/all/[email protected]/
-- Steve
On Wed, 15 Mar 2023 17:37:30 -0700
"Paul E. McKenney" <[email protected]> wrote:
>
> That does work, and I am guessing that the size increase is not a big
> problem for you there.
Well, I was fine with it as long as it stayed in the headers, where
ugliness is warmly welcomed. Just ask all the #ifdefs.
>
> > That's a cop out, just removing the one case you care about. Fact is
> > the naming is awful, and the 1/2 argument thing is making it worse.
> > If a big change is warranted, why not do it right and ACTUALLY
> > get it right?
>
> You both do realize that the kvfree_rcu_mightsleep() definition is
> already in mainline, right?
>
> Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire
> community eventually decides to name it--can do any of the following:
>
> 1. Put the pointer into an already allocated array of pointers.
>
> 2. Allocate a new array of pointers, have the allocation succeed
> without sleeping, then put the pointer into an already allocated
> array of pointers.
>
> 3. Allocate a new array of pointers, have the allocation succeed
> after sleeping, then put the pointer into an already allocated
> array of pointers.
>
> 4. Attempt to allocate a new array of pointers, have the allocation
> fail (presumably after sleeping), then invoke synchronize_rcu()
> directly.
>
> Too much fun! ;-)
>
kvfree_rcu_kitchen_sink() ?
kvfree_rcu_goldie_locks()?
I honestly like the name "headless" as that perfectly describes the
difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1).
Whereas mightsleep() is confusing to me because it doesn't tell me why
kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one.
Usually, code that has two sleep variants is about limiting the
functionality of the atomic friendly one.
-- Steve
On Wed, 15 Mar 2023 22:13:26 -0400
Joel Fernandes <[email protected]> wrote:
> > Why not call it kvfree_rcu_alloc() ? It allocates right?
>
> Sure, but one can say now that allocating is an implementation detail? ;-)
I was being sarcastic.
But as the mighty Linus once said: "In the Internet, nobody can hear your sarcasm"
-- Steve
On Wed, Mar 15, 2023 at 09:25:16PM -0400, Theodore Ts'o wrote:
> On Wed, Mar 15, 2023 at 06:08:19PM -0400, Joel Fernandes wrote:
> >
> > I am doubtful there may be a future where it does not sleep. Why?
> > Because you need an rcu_head *somewhere*.
>
> I think the real problem was that this won't sleep:
>
> kfree_rcu(ptr, rhf);
>
> While this *could* sleep:
>
> kfree_rcu(ptr);
>
> So the the original sin was to try to make the same mistake that C++
> did --- which is to think that it's good to have functions that have
> the same name but different function signatures, and in some cases,
> different semantic meanings because they have different implementations.
Guilty to charges as read. ;-)
> Personally, this is why I refuse to use C++ for any of my personal
> projects --- this kind of "magic" looks good, but it's a great way to
> potentially shoot yourself (or worse, your users) in the foot.
>
> So separating out the two-argument kfree_rcu() from the one-argument
> kfree_rcu(), by renaming the latter to something else is IMHO, a
> Really F***** Good Idea. So while, sure, kfree_rcu_mightsleep() might
> be a little awkward, the name documents the potential landmind
> involved with using that function, that's a good thing. Because do
> you really think users will always conscientiously check the
> documentation and/or the implementation before using the interface? :-)
>
> If you hate that name, one other possibility is to try to use the
> two-argument form kfree_rcu() and arrange to *have* a rcu_head in the
> structure. That's going to be better from a performance perspective,
> and thus kinder to the end user than using rcu_synchronize().
The original reason for single-argument kvfree_rcu() was to avoid
the need for that rcu_head. The use case was a small data structure
with an extremely high population.
Thanx, Paul
On Wed, Mar 15, 2023 at 10:23:23PM -0400, Steven Rostedt wrote:
> On Wed, 15 Mar 2023 17:37:30 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> >
> > That does work, and I am guessing that the size increase is not a big
> > problem for you there.
>
> Well, I was fine with it as long as it stayed in the headers, where
> ugliness is warmly welcomed. Just ask all the #ifdefs.
>
> >
> > > That's a cop out, just removing the one case you care about. Fact is
> > > the naming is awful, and the 1/2 argument thing is making it worse.
> > > If a big change is warranted, why not do it right and ACTUALLY
> > > get it right?
> >
> > You both do realize that the kvfree_rcu_mightsleep() definition is
> > already in mainline, right?
> >
> > Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire
> > community eventually decides to name it--can do any of the following:
> >
> > 1. Put the pointer into an already allocated array of pointers.
> >
> > 2. Allocate a new array of pointers, have the allocation succeed
> > without sleeping, then put the pointer into an already allocated
> > array of pointers.
> >
> > 3. Allocate a new array of pointers, have the allocation succeed
> > after sleeping, then put the pointer into an already allocated
> > array of pointers.
> >
> > 4. Attempt to allocate a new array of pointers, have the allocation
> > fail (presumably after sleeping), then invoke synchronize_rcu()
> > directly.
> >
> > Too much fun! ;-)
> >
>
> kvfree_rcu_kitchen_sink() ?
>
> kvfree_rcu_goldie_locks()?
>
> I honestly like the name "headless" as that perfectly describes the
> difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1).
>
> Whereas mightsleep() is confusing to me because it doesn't tell me why
> kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one.
> Usually, code that has two sleep variants is about limiting the
> functionality of the atomic friendly one.
kvfree_rcu_alloc_head()?
kvfree_rcu_dynhead()?
kvfree_rcu_gearhead()?
kvfree_rcu_radiohead()?
kvfree_rcu_getahead()?
I don't know about you guys, but to me, kvfree_rcu_mightsleep() is
sounding better and better by comparison...
Thanx, Paul
On Wed, Mar 15, 2023 at 11:44 PM Paul E. McKenney <[email protected]> wrote:
>
> On Wed, Mar 15, 2023 at 10:23:23PM -0400, Steven Rostedt wrote:
> > On Wed, 15 Mar 2023 17:37:30 -0700
> > "Paul E. McKenney" <[email protected]> wrote:
> >
> > >
> > > That does work, and I am guessing that the size increase is not a big
> > > problem for you there.
> >
> > Well, I was fine with it as long as it stayed in the headers, where
> > ugliness is warmly welcomed. Just ask all the #ifdefs.
> >
> > >
> > > > That's a cop out, just removing the one case you care about. Fact is
> > > > the naming is awful, and the 1/2 argument thing is making it worse.
> > > > If a big change is warranted, why not do it right and ACTUALLY
> > > > get it right?
> > >
> > > You both do realize that the kvfree_rcu_mightsleep() definition is
> > > already in mainline, right?
> > >
> > > Anyway, to sum up, kvfree_rcu_mightsleep()--or whatever the entire
> > > community eventually decides to name it--can do any of the following:
> > >
> > > 1. Put the pointer into an already allocated array of pointers.
> > >
> > > 2. Allocate a new array of pointers, have the allocation succeed
> > > without sleeping, then put the pointer into an already allocated
> > > array of pointers.
> > >
> > > 3. Allocate a new array of pointers, have the allocation succeed
> > > after sleeping, then put the pointer into an already allocated
> > > array of pointers.
> > >
> > > 4. Attempt to allocate a new array of pointers, have the allocation
> > > fail (presumably after sleeping), then invoke synchronize_rcu()
> > > directly.
> > >
> > > Too much fun! ;-)
> > >
> >
> > kvfree_rcu_kitchen_sink() ?
> >
> > kvfree_rcu_goldie_locks()?
> >
> > I honestly like the name "headless" as that perfectly describes the
> > difference between kvfree_rcu(arg1, arg2) and kvfree_rcu(arg1).
> >
> > Whereas mightsleep() is confusing to me because it doesn't tell me why
> > kvfree_rcu() has two args and kvfree_rcu_mightsleep() has only one.
> > Usually, code that has two sleep variants is about limiting the
> > functionality of the atomic friendly one.
>
> kvfree_rcu_alloc_head()?
> kvfree_rcu_dynhead()?
> kvfree_rcu_gearhead()?
> kvfree_rcu_radiohead()?
> kvfree_rcu_getahead()?
>
> I don't know about you guys, but to me, kvfree_rcu_mightsleep() is
> sounding better and better by comparison...
Indeed, and one could argue that "headless" sounds like something out
of a horror movie ;-). Which of course does match the situation when
the API is applied incorrectly.
- Joel
On Wed, Mar 15, 2023 at 10:50 PM Steven Rostedt <[email protected]> wrote:
>
> On Wed, 15 Mar 2023 22:13:26 -0400
> Joel Fernandes <[email protected]> wrote:
>
> > > Why not call it kvfree_rcu_alloc() ? It allocates right?
> >
> > Sure, but one can say now that allocating is an implementation detail? ;-)
>
> I was being sarcastic.
>
> But as the mighty Linus once said: "In the Internet, nobody can hear your sarcasm"
>
I guess it's official then - the internet has killed sarcasm. I
suppose we'll all have to resort to using more obvious forms of humor,
like dad jokes. ;)
- Joel
> On Thu, 9 Mar 2023 14:45:21 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > The kvfree_rcu()'s single argument name is deprecated therefore
> > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> > > underline that it is for sleepable contexts.
> > >
> > > Cc: Steven Rostedt (VMware) <[email protected]>
> > > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > >
> > Could you please add you reviwed-by or Acked-by tags so we can bring
> > our series with renaming for the next merge window?
>
> I don't know. Perhaps we should just apply this patch and not worry about
> sleeping and whatnot.
>
> -- Steve
>
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index 04f0fdae19a1..5de945a8f61d 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
> struct osnoise_instance {
> struct list_head list;
> struct trace_array *tr;
> + struct rcu_head rcu;
> };
>
> static struct list_head osnoise_instances;
> @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
> if (!found)
> return;
>
> - kvfree_rcu(inst);
> + kvfree_rcu(inst, rcu);
> }
>
> /*
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 20d0c4a97633..ef5fafb40c76 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
> return -ENOENT;
>
> list_del_rcu(&link->list);
> - kvfree_rcu(link);
> + kvfree_rcu(link, rcu);
>
> if (list_empty(&tp->event->files))
> trace_probe_clear_flag(tp, TP_FLAG_TRACE);
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index ef8ed3b65d05..e6037752dcf0 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -256,6 +256,7 @@ struct trace_probe {
> struct event_file_link {
> struct trace_event_file *file;
> struct list_head list;
> + struct rcu_head rcu;
> };
>
> static inline bool trace_probe_test_flag(struct trace_probe *tp,
>
struct foo_a {
int a;
int b;
};
your obj size is 8 byte
struct foo_b {
struct rcu_head rcu;
int a;
int b;
};
now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
will be 32 bytes since there is no slab for 24 bytes:
<snip>
kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0
kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0
kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0
<snip>
if we allocate 512 objects of foo_a it would be 4096 bytes
in case of foo_b it is 24 * 512 = 12228 bytes.
single argument will give you 4096 + 512 * 8 = 8192 bytes
int terms of memory consumtion.
And double argument will not give you better performance comparing
with a single argument.
So it depends on what you want to achieve by that patch.
--
Uladzislau Rezki
On Thu, 16 Mar 2023 00:16:39 -0400
Joel Fernandes <[email protected]> wrote:
> Indeed, and one could argue that "headless" sounds like something out
> of a horror movie ;-). Which of course does match the situation when
> the API is applied incorrectly.
Well, "headless" is a common term in IT.
https://en.wikipedia.org/wiki/Headless_software
We could be specific to what horror movie/story, and call it:
kvfree_rcu_sleepy_hollow()
Which will imply both headless *and* might_sleep!
-- Steve
On Thu, 16 Mar 2023 09:16:37 +0100
Uladzislau Rezki <[email protected]> wrote:
> > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > index ef8ed3b65d05..e6037752dcf0 100644
> > --- a/kernel/trace/trace_probe.h
> > +++ b/kernel/trace/trace_probe.h
> > @@ -256,6 +256,7 @@ struct trace_probe {
> > struct event_file_link {
> > struct trace_event_file *file;
> > struct list_head list;
> > + struct rcu_head rcu;
> > };
> >
> > static inline bool trace_probe_test_flag(struct trace_probe *tp,
> >
> struct foo_a {
> int a;
> int b;
> };
Most machines today are 64 bits, even low end machines.
struct foo_a {
long long a;
long long b;
};
is more accurate. That's 16 bytes.
Although it is more likely off because list_head is a double pointer. But
let's just go with this, as the amount really doesn't matter here.
>
> your obj size is 8 byte
>
> struct foo_b {
> struct rcu_head rcu;
Isn't rcu_head defined as;
struct callback_head {
struct callback_head *next;
void (*func)(struct callback_head *head);
} __attribute__((aligned(sizeof(void *))));
#define rcu_head callback_head
Which makes it 8 not 16 on 32 bit as well?
> int a;
> int b;
> };
So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit.
>
> now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
> will be 32 bytes since there is no slab for 24 bytes:
>
> <snip>
> kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0
> kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0
> kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0
> <snip>
>
> if we allocate 512 objects of foo_a it would be 4096 bytes
> in case of foo_b it is 24 * 512 = 12228 bytes.
This is for probe events. We usually allocate 1, maybe 2. Oh, some may even
allocate 100 to be crazy. But each probe event is in reality much larger
(1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes
extra is still lost in the noise.
>
> single argument will give you 4096 + 512 * 8 = 8192 bytes
> int terms of memory consumtion.
If someone allocate 512 instances, that would be closer to a meg in size
without this change. 8k is probably less than 1%
>
> And double argument will not give you better performance comparing
> with a single argument.
It will, because it will no longer have to allocate anything if need be.
Note, when it doesn't allocate the system is probably mostly idle and we
don't care about performance, but when it needs allocation, that's likely a
time when performance is a bit more important.
-- Steve
On Thu, Mar 16, 2023 at 08:14:24AM -0400, Steven Rostedt wrote:
> On Thu, 16 Mar 2023 00:16:39 -0400
> Joel Fernandes <[email protected]> wrote:
>
> > Indeed, and one could argue that "headless" sounds like something out
> > of a horror movie ;-). Which of course does match the situation when
> > the API is applied incorrectly.
>
> Well, "headless" is a common term in IT.
>
> https://en.wikipedia.org/wiki/Headless_software
Indeed it is. But RCU is incapable of headlessness, so in the
kvfree_rcu_mightsleep() case, the rcu_head is dynamically allocated.
> We could be specific to what horror movie/story, and call it:
>
> kvfree_rcu_sleepy_hollow()
>
> Which will imply both headless *and* might_sleep!
Heh! That one is almost bad enough to be good! Almost! ;-)
Thanx, Paul
On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote:
> On Thu, 16 Mar 2023 09:16:37 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > > index ef8ed3b65d05..e6037752dcf0 100644
> > > --- a/kernel/trace/trace_probe.h
> > > +++ b/kernel/trace/trace_probe.h
> > > @@ -256,6 +256,7 @@ struct trace_probe {
> > > struct event_file_link {
> > > struct trace_event_file *file;
> > > struct list_head list;
> > > + struct rcu_head rcu;
> > > };
> > >
> > > static inline bool trace_probe_test_flag(struct trace_probe *tp,
> > >
> > struct foo_a {
> > int a;
> > int b;
> > };
>
> Most machines today are 64 bits, even low end machines.
>
> struct foo_a {
> long long a;
> long long b;
> };
>
> is more accurate. That's 16 bytes.
>
> Although it is more likely off because list_head is a double pointer. But
> let's just go with this, as the amount really doesn't matter here.
>
> >
> > your obj size is 8 byte
> >
> > struct foo_b {
> > struct rcu_head rcu;
>
> Isn't rcu_head defined as;
>
> struct callback_head {
> struct callback_head *next;
> void (*func)(struct callback_head *head);
> } __attribute__((aligned(sizeof(void *))));
> #define rcu_head callback_head
>
> Which makes it 8 not 16 on 32 bit as well?
>
> > int a;
> > int b;
> > };
>
> So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit.
>
> >
> > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
> > will be 32 bytes since there is no slab for 24 bytes:
> >
> > <snip>
> > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0
> > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0
> > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0
> > <snip>
> >
> > if we allocate 512 objects of foo_a it would be 4096 bytes
> > in case of foo_b it is 24 * 512 = 12228 bytes.
>
> This is for probe events. We usually allocate 1, maybe 2. Oh, some may even
> allocate 100 to be crazy. But each probe event is in reality much larger
> (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes
> extra is still lost in the noise.
>
> >
> > single argument will give you 4096 + 512 * 8 = 8192 bytes
> > int terms of memory consumtion.
>
> If someone allocate 512 instances, that would be closer to a meg in size
> without this change. 8k is probably less than 1%
>
In percentage. My case. (12228 - 8192) * 100 / 12228 = ~33% difference.
> >
> > And double argument will not give you better performance comparing
> > with a single argument.
>
> It will, because it will no longer have to allocate anything if need be.
> Note, when it doesn't allocate the system is probably mostly idle and we
> don't care about performance, but when it needs allocation, that's likely a
> time when performance is a bit more important.
>
The problem further is about pointer chasing, like comparing arrays and
lists. It will take longer time to offload all pointers.
--
Uladzislau Rezki
On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote:
> On Thu, 16 Mar 2023 09:16:37 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > > index ef8ed3b65d05..e6037752dcf0 100644
> > > --- a/kernel/trace/trace_probe.h
> > > +++ b/kernel/trace/trace_probe.h
> > > @@ -256,6 +256,7 @@ struct trace_probe {
> > > struct event_file_link {
> > > struct trace_event_file *file;
> > > struct list_head list;
> > > + struct rcu_head rcu;
> > > };
> > >
> > > static inline bool trace_probe_test_flag(struct trace_probe *tp,
> > >
> > struct foo_a {
> > int a;
> > int b;
> > };
>
> Most machines today are 64 bits, even low end machines.
>
> struct foo_a {
> long long a;
> long long b;
> };
>
> is more accurate. That's 16 bytes.
>
> Although it is more likely off because list_head is a double pointer. But
> let's just go with this, as the amount really doesn't matter here.
>
> >
> > your obj size is 8 byte
> >
> > struct foo_b {
> > struct rcu_head rcu;
>
> Isn't rcu_head defined as;
>
> struct callback_head {
> struct callback_head *next;
> void (*func)(struct callback_head *head);
> } __attribute__((aligned(sizeof(void *))));
> #define rcu_head callback_head
>
> Which makes it 8 not 16 on 32 bit as well?
>
> > int a;
> > int b;
> > };
>
> So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit.
>
> >
> > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
> > will be 32 bytes since there is no slab for 24 bytes:
> >
> > <snip>
> > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0
> > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0
> > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0
> > <snip>
> >
> > if we allocate 512 objects of foo_a it would be 4096 bytes
> > in case of foo_b it is 24 * 512 = 12228 bytes.
>
> This is for probe events. We usually allocate 1, maybe 2. Oh, some may even
> allocate 100 to be crazy. But each probe event is in reality much larger
> (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes
> extra is still lost in the noise.
Agreed, in this case, there is almost no penalty for adding an rcu_head
structure to the probe event...
> > single argument will give you 4096 + 512 * 8 = 8192 bytes
> > int terms of memory consumtion.
>
> If someone allocate 512 instances, that would be closer to a meg in size
> without this change. 8k is probably less than 1%
>
> >
> > And double argument will not give you better performance comparing
> > with a single argument.
>
> It will, because it will no longer have to allocate anything if need be.
> Note, when it doesn't allocate the system is probably mostly idle and we
> don't care about performance, but when it needs allocation, that's likely a
> time when performance is a bit more important.
... and also agreed that there are performance benefits to doing so,
especially under OOM conditions.
Thanx, Paul
On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote:
> On Thu, 9 Mar 2023 14:45:21 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > The kvfree_rcu()'s single argument name is deprecated therefore
> > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> > > underline that it is for sleepable contexts.
> > >
> > > Cc: Steven Rostedt (VMware) <[email protected]>
> > > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > >
> > Could you please add you reviwed-by or Acked-by tags so we can bring
> > our series with renaming for the next merge window?
>
> I don't know. Perhaps we should just apply this patch and not worry about
> sleeping and whatnot.
Just in case it is not clear:
Acked-by: Paul E. McKenney <[email protected]>
> -- Steve
>
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index 04f0fdae19a1..5de945a8f61d 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
> struct osnoise_instance {
> struct list_head list;
> struct trace_array *tr;
> + struct rcu_head rcu;
> };
>
> static struct list_head osnoise_instances;
> @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
> if (!found)
> return;
>
> - kvfree_rcu(inst);
> + kvfree_rcu(inst, rcu);
> }
>
> /*
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 20d0c4a97633..ef5fafb40c76 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
> return -ENOENT;
>
> list_del_rcu(&link->list);
> - kvfree_rcu(link);
> + kvfree_rcu(link, rcu);
>
> if (list_empty(&tp->event->files))
> trace_probe_clear_flag(tp, TP_FLAG_TRACE);
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index ef8ed3b65d05..e6037752dcf0 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -256,6 +256,7 @@ struct trace_probe {
> struct event_file_link {
> struct trace_event_file *file;
> struct list_head list;
> + struct rcu_head rcu;
> };
>
> static inline bool trace_probe_test_flag(struct trace_probe *tp,
On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote:
> On Thu, 9 Mar 2023 14:45:21 +0100
> Uladzislau Rezki <[email protected]> wrote:
>
> > > The kvfree_rcu()'s single argument name is deprecated therefore
> > > rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
> > > underline that it is for sleepable contexts.
> > >
> > > Cc: Steven Rostedt (VMware) <[email protected]>
> > > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > >
> > Could you please add you reviwed-by or Acked-by tags so we can bring
> > our series with renaming for the next merge window?
>
> I don't know. Perhaps we should just apply this patch and not worry about
> sleeping and whatnot.
>
> -- Steve
>
> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
> index 04f0fdae19a1..5de945a8f61d 100644
> --- a/kernel/trace/trace_osnoise.c
> +++ b/kernel/trace/trace_osnoise.c
> @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
> struct osnoise_instance {
> struct list_head list;
> struct trace_array *tr;
> + struct rcu_head rcu;
> };
>
> static struct list_head osnoise_instances;
> @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
> if (!found)
> return;
>
> - kvfree_rcu(inst);
> + kvfree_rcu(inst, rcu);
> }
>
> /*
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 20d0c4a97633..ef5fafb40c76 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
> return -ENOENT;
>
> list_del_rcu(&link->list);
> - kvfree_rcu(link);
> + kvfree_rcu(link, rcu);
>
> if (list_empty(&tp->event->files))
> trace_probe_clear_flag(tp, TP_FLAG_TRACE);
> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> index ef8ed3b65d05..e6037752dcf0 100644
> --- a/kernel/trace/trace_probe.h
> +++ b/kernel/trace/trace_probe.h
> @@ -256,6 +256,7 @@ struct trace_probe {
> struct event_file_link {
> struct trace_event_file *file;
> struct list_head list;
> + struct rcu_head rcu;
> };
>
> static inline bool trace_probe_test_flag(struct trace_probe *tp,
>
Anyway i do not see any problems with it
Acked-by: Uladzislau Rezki (Sony) <[email protected]>
--
Uladzislau Rezki
> On Mar 16, 2023, at 1:58 PM, Uladzislau Rezki <[email protected]> wrote:
>
> On Wed, Mar 15, 2023 at 06:36:48PM -0400, Steven Rostedt wrote:
>> On Thu, 9 Mar 2023 14:45:21 +0100
>> Uladzislau Rezki <[email protected]> wrote:
>>
>>>> The kvfree_rcu()'s single argument name is deprecated therefore
>>>> rename it to kvfree_rcu_mightsleep() variant. The goal is explicitly
>>>> underline that it is for sleepable contexts.
>>>>
>>>> Cc: Steven Rostedt (VMware) <[email protected]>
>>>> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
>>>>
>>> Could you please add you reviwed-by or Acked-by tags so we can bring
>>> our series with renaming for the next merge window?
>>
>> I don't know. Perhaps we should just apply this patch and not worry about
>> sleeping and whatnot.
>>
>> -- Steve
>>
>> diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
>> index 04f0fdae19a1..5de945a8f61d 100644
>> --- a/kernel/trace/trace_osnoise.c
>> +++ b/kernel/trace/trace_osnoise.c
>> @@ -76,6 +76,7 @@ static unsigned long osnoise_options = OSN_DEFAULT_OPTIONS;
>> struct osnoise_instance {
>> struct list_head list;
>> struct trace_array *tr;
>> + struct rcu_head rcu;
>> };
>>
>> static struct list_head osnoise_instances;
>> @@ -159,7 +160,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
>> if (!found)
>> return;
>>
>> - kvfree_rcu(inst);
>> + kvfree_rcu(inst, rcu);
>> }
>>
>> /*
>> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
>> index 20d0c4a97633..ef5fafb40c76 100644
>> --- a/kernel/trace/trace_probe.c
>> +++ b/kernel/trace/trace_probe.c
>> @@ -1172,7 +1172,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
>> return -ENOENT;
>>
>> list_del_rcu(&link->list);
>> - kvfree_rcu(link);
>> + kvfree_rcu(link, rcu);
>>
>> if (list_empty(&tp->event->files))
>> trace_probe_clear_flag(tp, TP_FLAG_TRACE);
>> diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
>> index ef8ed3b65d05..e6037752dcf0 100644
>> --- a/kernel/trace/trace_probe.h
>> +++ b/kernel/trace/trace_probe.h
>> @@ -256,6 +256,7 @@ struct trace_probe {
>> struct event_file_link {
>> struct trace_event_file *file;
>> struct list_head list;
>> + struct rcu_head rcu;
>> };
>>
>> static inline bool trace_probe_test_flag(struct trace_probe *tp,
>>
> Anyway i do not see any problems with it
>
> Acked-by: Uladzislau Rezki (Sony) <[email protected]>
Just to clarify, we are dropping the original patch and instead taking Steves version?
If so, Steve please send a patch when you are able to and with Vlads Ack,
we can take it via the RCU tree if that is Ok with you.
Thanks,
- Joel
>
> --
> Uladzislau Rezki
On Thu, Mar 16, 2023 at 04:05:09PM +0100, Uladzislau Rezki wrote:
> On Thu, Mar 16, 2023 at 09:56:53AM -0400, Steven Rostedt wrote:
> > On Thu, 16 Mar 2023 09:16:37 +0100
> > Uladzislau Rezki <[email protected]> wrote:
> >
> > > > diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
> > > > index ef8ed3b65d05..e6037752dcf0 100644
> > > > --- a/kernel/trace/trace_probe.h
> > > > +++ b/kernel/trace/trace_probe.h
> > > > @@ -256,6 +256,7 @@ struct trace_probe {
> > > > struct event_file_link {
> > > > struct trace_event_file *file;
> > > > struct list_head list;
> > > > + struct rcu_head rcu;
> > > > };
> > > >
> > > > static inline bool trace_probe_test_flag(struct trace_probe *tp,
> > > >
> > > struct foo_a {
> > > int a;
> > > int b;
> > > };
> >
> > Most machines today are 64 bits, even low end machines.
> >
> > struct foo_a {
> > long long a;
> > long long b;
> > };
> >
> > is more accurate. That's 16 bytes.
> >
> > Although it is more likely off because list_head is a double pointer. But
> > let's just go with this, as the amount really doesn't matter here.
> >
> > >
> > > your obj size is 8 byte
> > >
> > > struct foo_b {
> > > struct rcu_head rcu;
> >
> > Isn't rcu_head defined as;
> >
> > struct callback_head {
> > struct callback_head *next;
> > void (*func)(struct callback_head *head);
> > } __attribute__((aligned(sizeof(void *))));
> > #define rcu_head callback_head
> >
> > Which makes it 8 not 16 on 32 bit as well?
> >
> > > int a;
> > > int b;
> > > };
> >
> > So it should be 8 + 8 = 16, on 32 bit and 16 + 16 = 32 on 64bit.
> >
> > >
> > > now it becomes 16 + 8 = 24 bytes. In reallity a foo_b object
> > > will be 32 bytes since there is no slab for 24 bytes:
> > >
> > > <snip>
> > > kmalloc-32 19840 19840 32 128 1 : tunables 0 0 0 : slabdata 155 155 0
> > > kmalloc-16 28857 28928 16 256 1 : tunables 0 0 0 : slabdata 113 113 0
> > > kmalloc-8 37376 37376 8 512 1 : tunables 0 0 0 : slabdata 73 73 0
> > > <snip>
> > >
> > > if we allocate 512 objects of foo_a it would be 4096 bytes
> > > in case of foo_b it is 24 * 512 = 12228 bytes.
> >
> > This is for probe events. We usually allocate 1, maybe 2. Oh, some may even
> > allocate 100 to be crazy. But each probe event is in reality much larger
> > (1K perhaps) as each one allocates dentry's, inodes, etc. So 8 or 16 bytes
> > extra is still lost in the noise.
> >
> > >
> > > single argument will give you 4096 + 512 * 8 = 8192 bytes
> > > int terms of memory consumtion.
> >
> > If someone allocate 512 instances, that would be closer to a meg in size
> > without this change. 8k is probably less than 1%
> >
> In percentage. My case. (12228 - 8192) * 100 / 12228 = ~33% difference.
>
> > >
> > > And double argument will not give you better performance comparing
> > > with a single argument.
> >
> > It will, because it will no longer have to allocate anything if need be.
> > Note, when it doesn't allocate the system is probably mostly idle and we
> > don't care about performance, but when it needs allocation, that's likely a
> > time when performance is a bit more important.
> >
> The problem further is about pointer chasing, like comparing arrays and
> lists. It will take longer time to offload all pointers.
>
Since i have a data, IMHO it is better to share than not:
--bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=3 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
# double-argument 10 run
Total time taken by all kfree'ers: 4387872408 ns, loops: 10000, batches: 958, memory footprint: 40MB
Total time taken by all kfree'ers: 4415232304 ns, loops: 10000, batches: 982, memory footprint: 39MB
Total time taken by all kfree'ers: 4270303081 ns, loops: 10000, batches: 955, memory footprint: 42MB
Total time taken by all kfree'ers: 4364984147 ns, loops: 10000, batches: 953, memory footprint: 40MB
Total time taken by all kfree'ers: 4225994506 ns, loops: 10000, batches: 942, memory footprint: 40MB
Total time taken by all kfree'ers: 4601087346 ns, loops: 10000, batches: 1033, memory footprint: 40MB
Total time taken by all kfree'ers: 4853397855 ns, loops: 10000, batches: 1109, memory footprint: 38MB
Total time taken by all kfree'ers: 4627914204 ns, loops: 10000, batches: 1037, memory footprint: 39MB
Total time taken by all kfree'ers: 4274587317 ns, loops: 10000, batches: 938, memory footprint: 33MB
Total time taken by all kfree'ers: 4642151205 ns, loops: 10000, batches: 1068, memory footprint: 39MB
# single-argument 10 run
Total time taken by all kfree'ers: 3661190052 ns, loops: 10000, batches: 831, memory footprint: 29MB
Total time taken by all kfree'ers: 3616277061 ns, loops: 10000, batches: 780, memory footprint: 27MB
Total time taken by all kfree'ers: 3704584439 ns, loops: 10000, batches: 810, memory footprint: 27MB
Total time taken by all kfree'ers: 3631291959 ns, loops: 10000, batches: 812, memory footprint: 28MB
Total time taken by all kfree'ers: 3610490769 ns, loops: 10000, batches: 795, memory footprint: 27MB
Total time taken by all kfree'ers: 3595446243 ns, loops: 10000, batches: 825, memory footprint: 28MB
Total time taken by all kfree'ers: 3686252889 ns, loops: 10000, batches: 784, memory footprint: 27MB
Total time taken by all kfree'ers: 3821475275 ns, loops: 10000, batches: 869, memory footprint: 27MB
Total time taken by all kfree'ers: 3740407185 ns, loops: 10000, batches: 813, memory footprint: 28MB
Total time taken by all kfree'ers: 3646684795 ns, loops: 10000, batches: 780, memory footprint: 24MB
And yes, there are side effects. For example a low memory condition.
--
Uladzislau Rezki
On Thu, 16 Mar 2023 14:01:30 -0400
Joel Fernandes <[email protected]> wrote:
> If so, Steve please send a patch when you are able to and with Vlads Ack,
> we can take it via the RCU tree if that is Ok with you.
Daniel acked it, so you can just take it as he can be responsible for that code.
For the trace_probe.c if you get an ack from Masami, then you can take
the original patch as is.
-- Steve
On Sat, Mar 18, 2023 at 12:11:31PM -0400, Steven Rostedt wrote:
> On Thu, 16 Mar 2023 14:01:30 -0400
> Joel Fernandes <[email protected]> wrote:
>
> > If so, Steve please send a patch when you are able to and with Vlads Ack,
> > we can take it via the RCU tree if that is Ok with you.
>
> Daniel acked it, so you can just take it as he can be responsible for that code.
>
> For the trace_probe.c if you get an ack from Masami, then you can take
> the original patch as is.
Masami, do you Ack the changes of the original patch to trace_probe.c ?
Here it is again:
https://lore.kernel.org/lkml/[email protected]/
thanks,
- Joel