2021-11-24 11:03:25

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 0/9] Switch to single argument kvfree_rcu() API

Hi!

This is an attempt to replace places in the Linux kernel for using
single argument of kvfree_rcu() API. In short, a single argument
API allows something like below:

<snip>
void *ptr = kvmalloc(some_bytes, GFP_KERNEL);
if (ptr)
kvfree_rcu(ptr);
<snip>

i.e. to free a pointer after a grace period of object that does not
have any rcu_head field inside its structure. Due to lacking of such
mechanism users just do:

<snip>
synchronize_rcu();
kfree(p);
<snip>

if they do not want to have an rcu_head element in their data. Thus
this series replaces mentioned places to go with the single argument API.

It is based on the 5.16.0-rc2 also it can be applied on Paul's latest RCU
"dev" branch.

Thanks!

Uladzislau Rezki (2):
ext4: Switch to kvfree_rcu() API
fs: nfs: sysfs: Switch to kvfree_rcu() API

Uladzislau Rezki (Sony) (7):
ext4: Replace ext4_kvfree_array_rcu() by kvfree_rcu() API
drivers: Switch to kvfree_rcu() API
x86/mm: Switch to kvfree_rcu() API
net/tipc: Switch to kvfree_rcu() API
net/core: Switch to kvfree_rcu() API
module: Switch to kvfree_rcu() API
tracing: Switch to kvfree_rcu() API

arch/x86/mm/mmio-mod.c | 6 ++--
drivers/block/drbd/drbd_nl.c | 9 ++----
drivers/block/drbd/drbd_receiver.c | 6 ++--
drivers/block/drbd/drbd_state.c | 3 +-
drivers/block/rnbd/rnbd-srv.c | 3 +-
drivers/crypto/nx/nx-common-pseries.c | 3 +-
drivers/infiniband/hw/hfi1/sdma.c | 3 +-
drivers/ipack/carriers/tpci200.c | 3 +-
drivers/mfd/dln2.c | 6 ++--
drivers/misc/vmw_vmci/vmci_context.c | 6 ++--
drivers/misc/vmw_vmci/vmci_event.c | 3 +-
.../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +-
.../net/ethernet/mellanox/mlx5/core/en/qos.c | 3 +-
.../ethernet/mellanox/mlx5/core/fpga/tls.c | 3 +-
drivers/net/ethernet/mellanox/mlxsw/core.c | 3 +-
drivers/scsi/device_handler/scsi_dh_alua.c | 3 +-
drivers/scsi/device_handler/scsi_dh_rdac.c | 3 +-
drivers/staging/fwserial/fwserial.c | 3 +-
fs/ext4/ext4.h | 1 -
fs/ext4/mballoc.c | 2 +-
fs/ext4/resize.c | 31 ++-----------------
fs/ext4/super.c | 5 ++-
fs/nfs/sysfs.c | 7 ++---
kernel/module.c | 3 +-
kernel/trace/trace_osnoise.c | 3 +-
kernel/trace/trace_probe.c | 3 +-
net/core/sysctl_net_core.c | 3 +-
net/tipc/crypto.c | 3 +-
28 files changed, 37 insertions(+), 96 deletions(-)

--
2.30.2



2021-11-24 11:03:27

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 1/9] ext4: Switch to kvfree_rcu() API

From: Uladzislau Rezki <[email protected]>

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: "Theodore Ts'o" <[email protected]>
Signed-off-by: Uladzislau Rezki <[email protected]>
---
fs/ext4/super.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4e33b5eca694..111b0498a232 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1886,8 +1886,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
return -1;
}
rcu_assign_pointer(sbi->s_qf_names[qtype], NULL);
- synchronize_rcu();
- kfree(old_qname);
+ kvfree_rcu(old_qname);
return 1;
}
#endif
--
2.30.2


2021-11-24 11:03:35

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 3/9] fs: nfs: sysfs: Switch to kvfree_rcu() API

From: Uladzislau Rezki <[email protected]>

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Trond Myklebust <[email protected]>
Signed-off-by: Uladzislau Rezki <[email protected]>
---
fs/nfs/sysfs.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c
index 8cb70755e3c9..ff88d5d58e1e 100644
--- a/fs/nfs/sysfs.c
+++ b/fs/nfs/sysfs.c
@@ -113,10 +113,9 @@ static ssize_t nfs_netns_identifier_store(struct kobject *kobj,
if (!p)
return -ENOMEM;
old = rcu_dereference_protected(xchg(&c->identifier, (char __rcu *)p), 1);
- if (old) {
- synchronize_rcu();
- kfree(old);
- }
+ if (old)
+ kvfree_rcu(old);
+
return count;
}

--
2.30.2


2021-11-24 11:03:39

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 2/9] ext4: Replace ext4_kvfree_array_rcu() by kvfree_rcu() API

The ext4_kvfree_array_rcu() function was introduced in order to
release some memory after a grace period during resizing of a
partition. An object that is freed does not contain any rcu_head
filed.

To do so, it requires to allocate some extra memory for a special
structure that has an rcu_head filed and pointer one where a freed
memory is attached. Finally call_rcu() API is invoked.

Since we have a single argument of kvfree_rcu() API, we can easily
replace all that tricky code by one single call that does the same
but in more efficient way.

CC: "Theodore Ts'o" <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
fs/ext4/ext4.h | 1 -
fs/ext4/mballoc.c | 2 +-
fs/ext4/resize.c | 31 ++-----------------------------
fs/ext4/super.c | 2 +-
4 files changed, 4 insertions(+), 32 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 404dd50856e5..7e8ff3ac2beb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3089,7 +3089,6 @@ extern int ext4_generic_delete_entry(struct inode *dir,
extern bool ext4_empty_dir(struct inode *inode);

/* resize.c */
-extern void ext4_kvfree_array_rcu(void *to_free);
extern int ext4_group_add(struct super_block *sb,
struct ext4_new_group_data *input);
extern int ext4_group_extend(struct super_block *sb,
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 215b7068f548..b0469f7a5c55 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -3109,7 +3109,7 @@ int ext4_mb_alloc_groupinfo(struct super_block *sb, ext4_group_t ngroups)
rcu_assign_pointer(sbi->s_group_info, new_groupinfo);
sbi->s_group_info_size = size / sizeof(*sbi->s_group_info);
if (old_groupinfo)
- ext4_kvfree_array_rcu(old_groupinfo);
+ kvfree_rcu(old_groupinfo);
ext4_debug("allocated s_groupinfo array for %d meta_bg's\n",
sbi->s_group_info_size);
return 0;
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index b63cb88ccdae..ac6aa037aaab 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -17,33 +17,6 @@

#include "ext4_jbd2.h"

-struct ext4_rcu_ptr {
- struct rcu_head rcu;
- void *ptr;
-};
-
-static void ext4_rcu_ptr_callback(struct rcu_head *head)
-{
- struct ext4_rcu_ptr *ptr;
-
- ptr = container_of(head, struct ext4_rcu_ptr, rcu);
- kvfree(ptr->ptr);
- kfree(ptr);
-}
-
-void ext4_kvfree_array_rcu(void *to_free)
-{
- struct ext4_rcu_ptr *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
-
- if (ptr) {
- ptr->ptr = to_free;
- call_rcu(&ptr->rcu, ext4_rcu_ptr_callback);
- return;
- }
- synchronize_rcu();
- kvfree(to_free);
-}
-
int ext4_resize_begin(struct super_block *sb)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -906,7 +879,7 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
n_group_desc[gdb_num] = gdb_bh;
rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
EXT4_SB(sb)->s_gdb_count++;
- ext4_kvfree_array_rcu(o_group_desc);
+ kvfree_rcu(o_group_desc);

lock_buffer(EXT4_SB(sb)->s_sbh);
le16_add_cpu(&es->s_reserved_gdt_blocks, -1);
@@ -969,7 +942,7 @@ static int add_new_gdb_meta_bg(struct super_block *sb,

rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
EXT4_SB(sb)->s_gdb_count++;
- ext4_kvfree_array_rcu(o_group_desc);
+ kvfree_rcu(o_group_desc);
return err;
}

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 111b0498a232..3942cd271a00 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2759,7 +2759,7 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup)
rcu_assign_pointer(sbi->s_flex_groups, new_groups);
sbi->s_flex_groups_allocated = size;
if (old_groups)
- ext4_kvfree_array_rcu(old_groups);
+ kvfree_rcu(old_groups);
return 0;
}

--
2.30.2


2021-11-24 11:03:52

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 6/9] net/tipc: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Jon Maloy <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
net/tipc/crypto.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/tipc/crypto.c b/net/tipc/crypto.c
index b4d9419a015b..c2d16c40778d 100644
--- a/net/tipc/crypto.c
+++ b/net/tipc/crypto.c
@@ -2391,8 +2391,7 @@ static void tipc_crypto_work_rx(struct work_struct *work)
resched = true;
break;
default:
- synchronize_rcu();
- kfree(rx->skey);
+ kvfree_rcu(rx->skey);
rx->skey = NULL;
break;
}
--
2.30.2


2021-11-24 11:03:57

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 5/9] x86/mm: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Steven Rostedt <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
arch/x86/mm/mmio-mod.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
index 933a2ebad471..e75137a06c32 100644
--- a/arch/x86/mm/mmio-mod.c
+++ b/arch/x86/mm/mmio-mod.c
@@ -307,10 +307,8 @@ static void iounmap_trace_core(volatile void __iomem *addr)

not_enabled:
spin_unlock_irq(&trace_lock);
- if (found_trace) {
- synchronize_rcu(); /* unregister_kmmio_probe() requirement */
- kfree(found_trace);
- }
+ if (found_trace)
+ kvfree_rcu(found_trace); /* unregister_kmmio_probe() requirement */
}

void mmiotrace_iounmap(volatile void __iomem *addr)
--
2.30.2


2021-11-24 11:04:07

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 4/9] drivers: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Philipp Reisner <[email protected]>
CC: "Md. Haris Iqbal" <[email protected]>
CC: Herbert Xu <[email protected]>
CC: Mike Marciniszyn <[email protected]>
CC: Samuel Iglesias Gonsalvez <[email protected]>
CC: Lee Jones <[email protected]>
CC: Jorgen Hansen <[email protected]>
CC: Raju Rangoju <[email protected]>
CC: Saeed Mahameed <[email protected]>
CC: Boris Pismenny <[email protected]>
CC: Jiri Pirko <[email protected]>
CC: "James E.J. Bottomley" <[email protected]>
CC: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 9 +++------
drivers/block/drbd/drbd_receiver.c | 6 ++----
drivers/block/drbd/drbd_state.c | 3 +--
drivers/block/rnbd/rnbd-srv.c | 3 +--
drivers/crypto/nx/nx-common-pseries.c | 3 +--
drivers/infiniband/hw/hfi1/sdma.c | 3 +--
drivers/ipack/carriers/tpci200.c | 3 +--
drivers/mfd/dln2.c | 6 ++----
drivers/misc/vmw_vmci/vmci_context.c | 6 ++----
drivers/misc/vmw_vmci/vmci_event.c | 3 +--
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +--
drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 3 +--
drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 3 +--
drivers/net/ethernet/mellanox/mlxsw/core.c | 3 +--
drivers/scsi/device_handler/scsi_dh_alua.c | 3 +--
drivers/scsi/device_handler/scsi_dh_rdac.c | 3 +--
drivers/staging/fwserial/fwserial.c | 3 +--
17 files changed, 22 insertions(+), 44 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 44ccf8b4f4b2..28f4d84945fd 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1679,8 +1679,7 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info)
drbd_send_sync_param(peer_device);
}

- synchronize_rcu();
- kfree(old_disk_conf);
+ kvfree_rcu(old_disk_conf);
kfree(old_plan);
mod_timer(&device->request_timer, jiffies + HZ);
goto success;
@@ -2511,8 +2510,7 @@ int drbd_adm_net_opts(struct sk_buff *skb, struct genl_info *info)

mutex_unlock(&connection->resource->conf_update);
mutex_unlock(&connection->data.mutex);
- synchronize_rcu();
- kfree(old_net_conf);
+ kvfree_rcu(old_net_conf);

if (connection->cstate >= C_WF_REPORT_PARAMS) {
struct drbd_peer_device *peer_device;
@@ -2925,8 +2923,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
new_disk_conf->disk_size = (sector_t)rs.resize_size;
rcu_assign_pointer(device->ldev->disk_conf, new_disk_conf);
mutex_unlock(&device->resource->conf_update);
- synchronize_rcu();
- kfree(old_disk_conf);
+ kvfree_rcu(old_disk_conf);
new_disk_conf = NULL;
}

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 1f740e42e457..d73a53242a75 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3799,8 +3799,7 @@ static int receive_protocol(struct drbd_connection *connection, struct packet_in
drbd_info(connection, "peer data-integrity-alg: %s\n",
integrity_alg[0] ? integrity_alg : "(none)");

- synchronize_rcu();
- kfree(old_net_conf);
+ kvfree_rcu(old_net_conf);
return 0;

disconnect_rcu_unlock:
@@ -4168,8 +4167,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info

rcu_assign_pointer(device->ldev->disk_conf, new_disk_conf);
mutex_unlock(&connection->resource->conf_update);
- synchronize_rcu();
- kfree(old_disk_conf);
+ kvfree_rcu(old_disk_conf);

drbd_info(device, "Peer sets u_size to %lu sectors (old: %lu)\n",
(unsigned long)p_usize, (unsigned long)my_usize);
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index b8a27818ab3f..826e496821c7 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -2071,8 +2071,7 @@ static int w_after_conn_state_ch(struct drbd_work *w, int unused)
conn_free_crypto(connection);
mutex_unlock(&connection->resource->conf_update);

- synchronize_rcu();
- kfree(old_conf);
+ kvfree_rcu(old_conf);
}

if (ns_max.susp_fen) {
diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c
index aafecfe97055..373dedd499b4 100644
--- a/drivers/block/rnbd/rnbd-srv.c
+++ b/drivers/block/rnbd/rnbd-srv.c
@@ -808,8 +808,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,

free_srv_sess_dev:
xa_erase(&srv_sess->index_idr, srv_sess_dev->device_id);
- synchronize_rcu();
- kfree(srv_sess_dev);
+ kvfree_rcu(srv_sess_dev);
srv_dev_put:
if (open_msg->access_mode != RNBD_ACCESS_RO) {
mutex_lock(&srv_dev->lock);
diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
index 4e304f6081e4..3faf30120ac9 100644
--- a/drivers/crypto/nx/nx-common-pseries.c
+++ b/drivers/crypto/nx/nx-common-pseries.c
@@ -1061,8 +1061,7 @@ static int nx842_probe(struct vio_dev *viodev,

rcu_assign_pointer(devdata, new_devdata);
spin_unlock_irqrestore(&devdata_mutex, flags);
- synchronize_rcu();
- kfree(old_devdata);
+ kvfree_rcu(old_devdata);

of_reconfig_notifier_register(&nx842_of_nb);

diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 2b6c24b7b586..0a0caf7360a4 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -1292,8 +1292,7 @@ void sdma_clean(struct hfi1_devdata *dd, size_t num_engines)
sdma_map_free(rcu_access_pointer(dd->sdma_map));
RCU_INIT_POINTER(dd->sdma_map, NULL);
spin_unlock_irq(&dd->sde_map_lock);
- synchronize_rcu();
- kfree(dd->per_sdma);
+ kvfree_rcu(dd->per_sdma);
dd->per_sdma = NULL;

if (dd->sdma_rht) {
diff --git a/drivers/ipack/carriers/tpci200.c b/drivers/ipack/carriers/tpci200.c
index cbfdadecb23b..de29fe594022 100644
--- a/drivers/ipack/carriers/tpci200.c
+++ b/drivers/ipack/carriers/tpci200.c
@@ -180,8 +180,7 @@ static int tpci200_free_irq(struct ipack_device *dev)
slot_irq = tpci200->slots[dev->slot].irq;
/* uninstall handler */
RCU_INIT_POINTER(tpci200->slots[dev->slot].irq, NULL);
- synchronize_rcu();
- kfree(slot_irq);
+ kvfree_rcu(slot_irq);
mutex_unlock(&tpci200->mutex);
return 0;
}
diff --git a/drivers/mfd/dln2.c b/drivers/mfd/dln2.c
index 852129ea0766..365e3e77cac4 100644
--- a/drivers/mfd/dln2.c
+++ b/drivers/mfd/dln2.c
@@ -179,10 +179,8 @@ void dln2_unregister_event_cb(struct platform_device *pdev, u16 id)

spin_unlock_irqrestore(&dln2->event_cb_lock, flags);

- if (found) {
- synchronize_rcu();
- kfree(i);
- }
+ if (found)
+ kvfree_rcu(i);
}
EXPORT_SYMBOL(dln2_unregister_event_cb);

diff --git a/drivers/misc/vmw_vmci/vmci_context.c b/drivers/misc/vmw_vmci/vmci_context.c
index c0b5e339d5a1..6cf3e21c7604 100644
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -687,10 +687,8 @@ int vmci_ctx_remove_notification(u32 context_id, u32 remote_cid)
}
spin_unlock(&context->lock);

- if (found) {
- synchronize_rcu();
- kfree(notifier);
- }
+ if (found)
+ kvfree_rcu(notifier);

vmci_ctx_put(context);

diff --git a/drivers/misc/vmw_vmci/vmci_event.c b/drivers/misc/vmw_vmci/vmci_event.c
index e3436abf39f4..2100297c94ad 100644
--- a/drivers/misc/vmw_vmci/vmci_event.c
+++ b/drivers/misc/vmw_vmci/vmci_event.c
@@ -209,8 +209,7 @@ int vmci_event_unsubscribe(u32 sub_id)
if (!s)
return VMCI_ERROR_NOT_FOUND;

- synchronize_rcu();
- kfree(s);
+ kvfree_rcu(s);

return VMCI_SUCCESS;
}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index dde1cf51d0ab..0619cb94f0e0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -7190,8 +7190,7 @@ static void remove_one(struct pci_dev *pdev)
}
pci_release_regions(pdev);
kfree(adapter->mbox_log);
- synchronize_rcu();
- kfree(adapter);
+ kvfree_rcu(adapter);
}

/* "Shutdown" quiesces the device, stopping Ingress Packet and Interrupt
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
index 50977f01a050..eaaf7e33f90b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
@@ -388,8 +388,7 @@ static int mlx5e_qos_alloc_queues(struct mlx5e_priv *priv, struct mlx5e_channels
sqs = rcu_replace_pointer(chs->c[i]->qos_sqs, NULL,
lockdep_is_held(&priv->state_lock));

- synchronize_rcu(); /* Sync with NAPI. */
- kvfree(sqs);
+ kvfree_rcu(sqs); /* Sync with NAPI. */
}
return -ENOMEM;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c
index 29b7339ebfa3..e501eed7fe9f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c
@@ -253,7 +253,7 @@ static void mlx5_fpga_tls_send_teardown_cmd(struct mlx5_core_dev *mdev,
MLX5_SET(tls_cmd, cmd, swid, swid);

mlx5_fpga_tls_flow_to_cmd(flow, cmd);
- kfree(flow);
+ kvfree_rcu(flow);

buf->sg[0].data = cmd;
buf->sg[0].size = MLX5_TLS_COMMAND_SIZE;
@@ -283,7 +283,6 @@ void mlx5_fpga_tls_del_flow(struct mlx5_core_dev *mdev, u32 swid,
return;
}

- synchronize_rcu(); /* before kfree(flow) */
mlx5_fpga_tls_send_teardown_cmd(mdev, flow, swid, flags);
}

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index 3fd3812b8f31..47c29769759b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -2190,8 +2190,7 @@ void mlxsw_core_rx_listener_unregister(struct mlxsw_core *mlxsw_core,
if (!rxl_item)
return;
list_del_rcu(&rxl_item->list);
- synchronize_rcu();
- kfree(rxl_item);
+ kvfree_rcu(rxl_item);
}
EXPORT_SYMBOL(mlxsw_core_rx_listener_unregister);

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 37d06f993b76..308246ce346a 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1238,8 +1238,7 @@ static void alua_bus_detach(struct scsi_device *sdev)
kref_put(&pg->kref, release_port_group);
}
sdev->handler_data = NULL;
- synchronize_rcu();
- kfree(h);
+ kvfree_rcu(h);
}

static struct scsi_device_handler alua_dh = {
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
index 66652ab409cc..dc687021ff3a 100644
--- a/drivers/scsi/device_handler/scsi_dh_rdac.c
+++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
@@ -782,8 +782,7 @@ static void rdac_bus_detach( struct scsi_device *sdev )
}
spin_unlock(&list_lock);
sdev->handler_data = NULL;
- synchronize_rcu();
- kfree(h);
+ kvfree_rcu(h);
}

static struct scsi_device_handler rdac_dh = {
diff --git a/drivers/staging/fwserial/fwserial.c b/drivers/staging/fwserial/fwserial.c
index e8fa7f53cd5e..c8539c4f5bea 100644
--- a/drivers/staging/fwserial/fwserial.c
+++ b/drivers/staging/fwserial/fwserial.c
@@ -2116,8 +2116,7 @@ static void fwserial_remove_peer(struct fwtty_peer *peer)
if (port)
fwserial_release_port(port, true);

- synchronize_rcu();
- kfree(peer);
+ kvfree_rcu(peer);
}

/**
--
2.30.2


2021-11-24 11:04:15

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 7/9] net/core: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: "David S. Miller" <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
net/core/sysctl_net_core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5f88526ad61c..8b4ab9a6e2d7 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -141,8 +141,7 @@ static int flow_limit_cpu_sysctl(struct ctl_table *table, int write,
lockdep_is_held(&flow_limit_update_mutex));
if (cur && !cpumask_test_cpu(i, mask)) {
RCU_INIT_POINTER(sd->flow_limit, NULL);
- synchronize_rcu();
- kfree(cur);
+ kvfree_rcu(cur);
} else if (!cur && cpumask_test_cpu(i, mask)) {
cur = kzalloc_node(len, GFP_KERNEL,
cpu_to_node(i));
--
2.30.2


2021-11-24 11:04:24

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 9/9] tracing: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Steven Rostedt <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
kernel/trace/trace_osnoise.c | 3 +--
kernel/trace/trace_probe.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 7520d43aed55..4719a848bf17 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -138,8 +138,7 @@ static void osnoise_unregister_instance(struct trace_array *tr)
if (!found)
return;

- synchronize_rcu();
- kfree(inst);
+ kvfree_rcu(inst);
}

/*
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 3ed2a3f37297..8a3822818bf8 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1138,8 +1138,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
return -ENOENT;

list_del_rcu(&link->list);
- synchronize_rcu();
- kfree(link);
+ kvfree_rcu(link);

if (list_empty(&tp->event->files))
trace_probe_clear_flag(tp, TP_FLAG_TRACE);
--
2.30.2


2021-11-24 11:04:26

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: [PATCH 8/9] module: Switch to kvfree_rcu() API

Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.

CC: Luis Chamberlain <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
---
kernel/module.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 84a9141a5e15..f404f0c9f385 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4150,8 +4150,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
ddebug_cleanup:
ftrace_release_mod(mod);
dynamic_debug_remove(mod, info->debug);
- synchronize_rcu();
- kfree(mod->args);
+ kvfree_rcu(mod->args);
free_arch_cleanup:
cfi_cleanup(mod);
module_arch_cleanup(mod);
--
2.30.2


2021-11-24 14:58:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/9] x86/mm: Switch to kvfree_rcu() API

On Wed, 24 Nov 2021 12:03:04 +0100
"Uladzislau Rezki (Sony)" <[email protected]> wrote:

> diff --git a/arch/x86/mm/mmio-mod.c b/arch/x86/mm/mmio-mod.c
> index 933a2ebad471..e75137a06c32 100644
> --- a/arch/x86/mm/mmio-mod.c
> +++ b/arch/x86/mm/mmio-mod.c
> @@ -307,10 +307,8 @@ static void iounmap_trace_core(volatile void __iomem *addr)
>
> not_enabled:
> spin_unlock_irq(&trace_lock);
> - if (found_trace) {
> - synchronize_rcu(); /* unregister_kmmio_probe() requirement */
> - kfree(found_trace);
> - }
> + if (found_trace)
> + kvfree_rcu(found_trace); /* unregister_kmmio_probe() requirement */
> }
>

This is the first I've seen kvfree_rcu() (that I actually noticed/remember,
I'm sure I probably was Cc'd on some patches). And I find the comment
around it very confusing:

Specifically:


* kvfree_rcu(ptr);
*
* where @ptr is a pointer to kvfree().

The above suggests that you should pass a pointer to the actual function
kvfree to kvfree_rcu(), which is not what I believe is to be done.

i.e. kvfree_rcu(kvfree) ???

Perhaps rewrite that to say:

* where @ptr is the pointer to be freed by kvfree().

?

Other than that, the patch looks fine to me.

Acked-by: Steven Rostedt (VMware) <[email protected]>

-- Steve

2021-11-24 14:59:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 5/9] x86/mm: Switch to kvfree_rcu() API

On Wed, 24 Nov 2021 09:58:03 -0500
Steven Rostedt <[email protected]> wrote:

> Other than that, the patch looks fine to me.
>
> Acked-by: Steven Rostedt (VMware) <[email protected]>

This was suppose to be for patch 9/9:

[PATCH 9/9] tracing: Switch to kvfree_rcu() API

I replied to the wrong patch.

-- Steve

2021-11-24 15:00:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 9/9] tracing: Switch to kvfree_rcu() API

On Wed, 24 Nov 2021 12:03:08 +0100
"Uladzislau Rezki (Sony)" <[email protected]> wrote:

> Instead of invoking a synchronize_rcu() to free a pointer
> after a grace period we can directly make use of new API
> that does the same but in more efficient way.
>
> CC: Steven Rostedt <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---

Replying to the correct patch this time.

Acked-by: Steven Rostedt (VMware) <[email protected]>

-- Steve

2021-11-24 18:01:20

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 5/9] x86/mm: Switch to kvfree_rcu() API

>
> This is the first I've seen kvfree_rcu() (that I actually noticed/remember,
> I'm sure I probably was Cc'd on some patches). And I find the comment
> around it very confusing:
>
> Specifically:
>
>
> * kvfree_rcu(ptr);
> *
> * where @ptr is a pointer to kvfree().
>
> The above suggests that you should pass a pointer to the actual function
> kvfree to kvfree_rcu(), which is not what I believe is to be done.
>
> i.e. kvfree_rcu(kvfree) ???
>
> Perhaps rewrite that to say:
>
> * where @ptr is the pointer to be freed by kvfree().
>
> ?
Indeed. I will fix that by sending a patch to change a description to
be less confusing :)

>
> Other than that, the patch looks fine to me.
>
> Acked-by: Steven Rostedt (VMware) <[email protected]>
>
I will place your "Acked-by" to 9/9 one.

Thanks!

--
Uladzislau Rezki

2021-11-24 18:03:18

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 9/9] tracing: Switch to kvfree_rcu() API

>
> On Wed, 24 Nov 2021 12:03:08 +0100
> "Uladzislau Rezki (Sony)" <[email protected]> wrote:
>
> > Instead of invoking a synchronize_rcu() to free a pointer
> > after a grace period we can directly make use of new API
> > that does the same but in more efficient way.
> >
> > CC: Steven Rostedt <[email protected]>
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > ---
>
> Replying to the correct patch this time.
>
> Acked-by: Steven Rostedt (VMware) <[email protected]>
>
Thanks!

--
Uladzislau Rezki

2021-11-29 14:19:15

by Lee Jones

[permalink] [raw]
Subject: Re: [PATCH 4/9] drivers: Switch to kvfree_rcu() API

On Wed, 24 Nov 2021, Uladzislau Rezki (Sony) wrote:

> Instead of invoking a synchronize_rcu() to free a pointer
> after a grace period we can directly make use of new API
> that does the same but in more efficient way.
>
> CC: Philipp Reisner <[email protected]>
> CC: "Md. Haris Iqbal" <[email protected]>
> CC: Herbert Xu <[email protected]>
> CC: Mike Marciniszyn <[email protected]>
> CC: Samuel Iglesias Gonsalvez <[email protected]>
> CC: Lee Jones <[email protected]>
> CC: Jorgen Hansen <[email protected]>
> CC: Raju Rangoju <[email protected]>
> CC: Saeed Mahameed <[email protected]>
> CC: Boris Pismenny <[email protected]>
> CC: Jiri Pirko <[email protected]>
> CC: "James E.J. Bottomley" <[email protected]>
> CC: Greg Kroah-Hartman <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> drivers/block/drbd/drbd_nl.c | 9 +++------
> drivers/block/drbd/drbd_receiver.c | 6 ++----
> drivers/block/drbd/drbd_state.c | 3 +--
> drivers/block/rnbd/rnbd-srv.c | 3 +--
> drivers/crypto/nx/nx-common-pseries.c | 3 +--
> drivers/infiniband/hw/hfi1/sdma.c | 3 +--
> drivers/ipack/carriers/tpci200.c | 3 +--

> drivers/mfd/dln2.c | 6 ++----

I'm not an expert in this API, but the premise and changes to MFD seem
fine at first glance:

Acked-by: Lee Jones <[email protected]>

> drivers/misc/vmw_vmci/vmci_context.c | 6 ++----
> drivers/misc/vmw_vmci/vmci_event.c | 3 +--
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +--
> drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 3 +--
> drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 3 +--
> drivers/net/ethernet/mellanox/mlxsw/core.c | 3 +--
> drivers/scsi/device_handler/scsi_dh_alua.c | 3 +--
> drivers/scsi/device_handler/scsi_dh_rdac.c | 3 +--
> drivers/staging/fwserial/fwserial.c | 3 +--
> 17 files changed, 22 insertions(+), 44 deletions(-)

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2021-11-29 17:21:16

by Marciniszyn, Mike

[permalink] [raw]
Subject: RE: [PATCH 4/9] drivers: Switch to kvfree_rcu() API

> Subject: Re: [PATCH 4/9] drivers: Switch to kvfree_rcu() API
>
> On Wed, 24 Nov 2021, Uladzislau Rezki (Sony) wrote:
>
> > Instead of invoking a synchronize_rcu() to free a pointer after a
> > grace period we can directly make use of new API that does the same
> > but in more efficient way.
> >
+--
> > drivers/infiniband/hw/hfi1/sdma.c | 3 +--

The sdma portion of the change looks good!

Mike

Acked-by: Mike Marciniszyn <[email protected]>

2021-11-29 18:54:21

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 4/9] drivers: Switch to kvfree_rcu() API

On Mon, Nov 29, 2021 at 1:58 PM Lee Jones <[email protected]> wrote:
>
> On Wed, 24 Nov 2021, Uladzislau Rezki (Sony) wrote:
>
> > Instead of invoking a synchronize_rcu() to free a pointer
> > after a grace period we can directly make use of new API
> > that does the same but in more efficient way.
> >
> > CC: Philipp Reisner <[email protected]>
> > CC: "Md. Haris Iqbal" <[email protected]>
> > CC: Herbert Xu <[email protected]>
> > CC: Mike Marciniszyn <[email protected]>
> > CC: Samuel Iglesias Gonsalvez <[email protected]>
> > CC: Lee Jones <[email protected]>
> > CC: Jorgen Hansen <[email protected]>
> > CC: Raju Rangoju <[email protected]>
> > CC: Saeed Mahameed <[email protected]>
> > CC: Boris Pismenny <[email protected]>
> > CC: Jiri Pirko <[email protected]>
> > CC: "James E.J. Bottomley" <[email protected]>
> > CC: Greg Kroah-Hartman <[email protected]>
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > ---
> > drivers/block/drbd/drbd_nl.c | 9 +++------
> > drivers/block/drbd/drbd_receiver.c | 6 ++----
> > drivers/block/drbd/drbd_state.c | 3 +--
> > drivers/block/rnbd/rnbd-srv.c | 3 +--
> > drivers/crypto/nx/nx-common-pseries.c | 3 +--
> > drivers/infiniband/hw/hfi1/sdma.c | 3 +--
> > drivers/ipack/carriers/tpci200.c | 3 +--
>
> > drivers/mfd/dln2.c | 6 ++----
>
> I'm not an expert in this API, but the premise and changes to MFD seem
> fine at first glance:
>
> Acked-by: Lee Jones <[email protected]>
>
Thanks!

--
Uladzislau Rezki

2021-11-30 10:39:13

by Miroslav Benes

[permalink] [raw]
Subject: Re: [PATCH 8/9] module: Switch to kvfree_rcu() API

Hi,

On Wed, 24 Nov 2021, Uladzislau Rezki (Sony) wrote:

> Instead of invoking a synchronize_rcu() to free a pointer
> after a grace period we can directly make use of new API
> that does the same but in more efficient way.
>
> CC: Luis Chamberlain <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> kernel/module.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 84a9141a5e15..f404f0c9f385 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -4150,8 +4150,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
> ddebug_cleanup:
> ftrace_release_mod(mod);
> dynamic_debug_remove(mod, info->debug);
> - synchronize_rcu();
> - kfree(mod->args);
> + kvfree_rcu(mod->args);
> free_arch_cleanup:
> cfi_cleanup(mod);
> module_arch_cleanup(mod);

hm, if I am not missing something, synchronize_rcu() is not really
connected to kfree(mod->args) there. synchronize_rcu() was added a long
time ago when kernel/module.c removed stop_machine() from the code and
replaced it with RCU to protect (at least?) mod->list. You can find
list_del_rcu(&mod->list) a couple of lines below.

And yes, one could ask how this all works. The error/cleanup sequence in
load_module() is a giant mess... well, load_module() is a mess too, but
the error path is really not nice.

Miroslav

Subject: Re: [PATCH 8/9] module: Switch to kvfree_rcu() API

On 2021-11-30 11:39:09 [+0100], Miroslav Benes wrote:
> > --- a/kernel/module.c
> > +++ b/kernel/module.c
> > @@ -4150,8 +4150,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > ddebug_cleanup:
> > ftrace_release_mod(mod);
> > dynamic_debug_remove(mod, info->debug);
> > - synchronize_rcu();
> > - kfree(mod->args);
> > + kvfree_rcu(mod->args);
> > free_arch_cleanup:
> > cfi_cleanup(mod);
> > module_arch_cleanup(mod);
>
> hm, if I am not missing something, synchronize_rcu() is not really
> connected to kfree(mod->args) there. synchronize_rcu() was added a long
> time ago when kernel/module.c removed stop_machine() from the code and
> replaced it with RCU to protect (at least?) mod->list. You can find
> list_del_rcu(&mod->list) a couple of lines below.

so instead synchronize_rcu() + kfree() you could do
call_rcu(&mod->args->rcu, kfree()) but since you have no RCU-head around
in args you wait for the grace period and then invoke kfree.

kvfree_rcu() is somehow like call_rcu() + kfree() but without the needed
RCU-head.
So you avoid waiting for the grace period but mod->args is freed later,
after as expected.

> And yes, one could ask how this all works. The error/cleanup sequence in
> load_module() is a giant mess... well, load_module() is a mess too, but
> the error path is really not nice.

Well, spring is coming :)

> Miroslav

Sebastian

2021-12-01 09:24:25

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 8/9] module: Switch to kvfree_rcu() API

On Tue, Nov 30, 2021 at 12:04:19PM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-11-30 11:39:09 [+0100], Miroslav Benes wrote:
> > > --- a/kernel/module.c
> > > +++ b/kernel/module.c
> > > @@ -4150,8 +4150,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > > ddebug_cleanup:
> > > ftrace_release_mod(mod);
> > > dynamic_debug_remove(mod, info->debug);
> > > - synchronize_rcu();
> > > - kfree(mod->args);
> > > + kvfree_rcu(mod->args);
> > > free_arch_cleanup:
> > > cfi_cleanup(mod);
> > > module_arch_cleanup(mod);
> >
> > hm, if I am not missing something, synchronize_rcu() is not really
> > connected to kfree(mod->args) there. synchronize_rcu() was added a long
> > time ago when kernel/module.c removed stop_machine() from the code and
> > replaced it with RCU to protect (at least?) mod->list. You can find
> > list_del_rcu(&mod->list) a couple of lines below.
>
> so instead synchronize_rcu() + kfree() you could do
> call_rcu(&mod->args->rcu, kfree()) but since you have no RCU-head around
> in args you wait for the grace period and then invoke kfree.
>
> kvfree_rcu() is somehow like call_rcu() + kfree() but without the needed
> RCU-head.
> So you avoid waiting for the grace period but mod->args is freed later,
> after as expected.
>
> > And yes, one could ask how this all works. The error/cleanup sequence in
> > load_module() is a giant mess... well, load_module() is a mess too, but
> > the error path is really not nice.
>
> Well, spring is coming :)
>
Indeed that error path sequence is terrible. I will double check if that
synchronize_rcu() + kfree() is related to any RCU protection and freeing.

If it is not i will drop this patch.


Thanks!

--
Vlad Rezki

2021-12-01 20:35:41

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 8/9] module: Switch to kvfree_rcu() API

On Wed, Dec 01, 2021 at 10:24:15AM +0100, Uladzislau Rezki wrote:
> On Tue, Nov 30, 2021 at 12:04:19PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2021-11-30 11:39:09 [+0100], Miroslav Benes wrote:
> > > > --- a/kernel/module.c
> > > > +++ b/kernel/module.c
> > > > @@ -4150,8 +4150,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
> > > > ddebug_cleanup:
> > > > ftrace_release_mod(mod);
> > > > dynamic_debug_remove(mod, info->debug);
> > > > - synchronize_rcu();
> > > > - kfree(mod->args);
> > > > + kvfree_rcu(mod->args);
> > > > free_arch_cleanup:
> > > > cfi_cleanup(mod);
> > > > module_arch_cleanup(mod);
> > >
> > > hm, if I am not missing something, synchronize_rcu() is not really
> > > connected to kfree(mod->args) there. synchronize_rcu() was added a long
> > > time ago when kernel/module.c removed stop_machine() from the code and
> > > replaced it with RCU to protect (at least?) mod->list. You can find
> > > list_del_rcu(&mod->list) a couple of lines below.
> >
> > so instead synchronize_rcu() + kfree() you could do
> > call_rcu(&mod->args->rcu, kfree()) but since you have no RCU-head around
> > in args you wait for the grace period and then invoke kfree.
> >
> > kvfree_rcu() is somehow like call_rcu() + kfree() but without the needed
> > RCU-head.
> > So you avoid waiting for the grace period but mod->args is freed later,
> > after as expected.
> >
> > > And yes, one could ask how this all works. The error/cleanup sequence in
> > > load_module() is a giant mess... well, load_module() is a mess too, but
> > > the error path is really not nice.
> >
> > Well, spring is coming :)
> >
> Indeed that error path sequence is terrible. I will double check if that
> synchronize_rcu() + kfree() is related to any RCU protection and freeing.
>
> If it is not i will drop this patch.
>
>
OK, that kfree has nothing to do with RCU protection:

<snip>
commit 6526c534b2677ca601b7b92851437feb041d02a1
Author: Rusty Russell <[email protected]>
Date: Thu Aug 5 12:59:10 2010 -0600

module: move module args strndup_user to just before use

Instead of copying and allocating the args and storing it in
load_info, we can just allocate them right before we need them.

Signed-off-by: Rusty Russell <[email protected]>
<snip>

so, i will drop my patch, since the intention is not to free a ptr
after a grace period.

Thank you for the review!

--
Vlad Rezki

2021-12-08 14:36:27

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 1/9] ext4: Switch to kvfree_rcu() API

+ [email protected]

On Wed, Nov 24, 2021 at 12:03 PM Uladzislau Rezki (Sony)
<[email protected]> wrote:
>
> From: Uladzislau Rezki <[email protected]>
>
> Instead of invoking a synchronize_rcu() to free a pointer
> after a grace period we can directly make use of new API
> that does the same but in more efficient way.
>
> CC: "Theodore Ts'o" <[email protected]>
> Signed-off-by: Uladzislau Rezki <[email protected]>
> ---
> fs/ext4/super.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4e33b5eca694..111b0498a232 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1886,8 +1886,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
> return -1;
> }
> rcu_assign_pointer(sbi->s_qf_names[qtype], NULL);
> - synchronize_rcu();
> - kfree(old_qname);
> + kvfree_rcu(old_qname);
> return 1;
> }
> #endif
> --
> 2.30.2
>


--
Uladzislau Rezki

2021-12-08 14:37:25

by Uladzislau Rezki (Sony)

[permalink] [raw]
Subject: Re: [PATCH 2/9] ext4: Replace ext4_kvfree_array_rcu() by kvfree_rcu() API

+ [email protected]

On Wed, Nov 24, 2021 at 12:03 PM Uladzislau Rezki (Sony)
<[email protected]> wrote:
>
> The ext4_kvfree_array_rcu() function was introduced in order to
> release some memory after a grace period during resizing of a
> partition. An object that is freed does not contain any rcu_head
> filed.
>
> To do so, it requires to allocate some extra memory for a special
> structure that has an rcu_head filed and pointer one where a freed
> memory is attached. Finally call_rcu() API is invoked.
>
> Since we have a single argument of kvfree_rcu() API, we can easily
> replace all that tricky code by one single call that does the same
> but in more efficient way.
>
> CC: "Theodore Ts'o" <[email protected]>
> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> ---
> fs/ext4/ext4.h | 1 -
> fs/ext4/mballoc.c | 2 +-
> fs/ext4/resize.c | 31 ++-----------------------------
> fs/ext4/super.c | 2 +-
> 4 files changed, 4 insertions(+), 32 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 404dd50856e5..7e8ff3ac2beb 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -3089,7 +3089,6 @@ extern int ext4_generic_delete_entry(struct inode *dir,
> extern bool ext4_empty_dir(struct inode *inode);
>
> /* resize.c */
> -extern void ext4_kvfree_array_rcu(void *to_free);
> extern int ext4_group_add(struct super_block *sb,
> struct ext4_new_group_data *input);
> extern int ext4_group_extend(struct super_block *sb,
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 215b7068f548..b0469f7a5c55 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -3109,7 +3109,7 @@ int ext4_mb_alloc_groupinfo(struct super_block *sb, ext4_group_t ngroups)
> rcu_assign_pointer(sbi->s_group_info, new_groupinfo);
> sbi->s_group_info_size = size / sizeof(*sbi->s_group_info);
> if (old_groupinfo)
> - ext4_kvfree_array_rcu(old_groupinfo);
> + kvfree_rcu(old_groupinfo);
> ext4_debug("allocated s_groupinfo array for %d meta_bg's\n",
> sbi->s_group_info_size);
> return 0;
> diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
> index b63cb88ccdae..ac6aa037aaab 100644
> --- a/fs/ext4/resize.c
> +++ b/fs/ext4/resize.c
> @@ -17,33 +17,6 @@
>
> #include "ext4_jbd2.h"
>
> -struct ext4_rcu_ptr {
> - struct rcu_head rcu;
> - void *ptr;
> -};
> -
> -static void ext4_rcu_ptr_callback(struct rcu_head *head)
> -{
> - struct ext4_rcu_ptr *ptr;
> -
> - ptr = container_of(head, struct ext4_rcu_ptr, rcu);
> - kvfree(ptr->ptr);
> - kfree(ptr);
> -}
> -
> -void ext4_kvfree_array_rcu(void *to_free)
> -{
> - struct ext4_rcu_ptr *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
> -
> - if (ptr) {
> - ptr->ptr = to_free;
> - call_rcu(&ptr->rcu, ext4_rcu_ptr_callback);
> - return;
> - }
> - synchronize_rcu();
> - kvfree(to_free);
> -}
> -
> int ext4_resize_begin(struct super_block *sb)
> {
> struct ext4_sb_info *sbi = EXT4_SB(sb);
> @@ -906,7 +879,7 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
> n_group_desc[gdb_num] = gdb_bh;
> rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
> EXT4_SB(sb)->s_gdb_count++;
> - ext4_kvfree_array_rcu(o_group_desc);
> + kvfree_rcu(o_group_desc);
>
> lock_buffer(EXT4_SB(sb)->s_sbh);
> le16_add_cpu(&es->s_reserved_gdt_blocks, -1);
> @@ -969,7 +942,7 @@ static int add_new_gdb_meta_bg(struct super_block *sb,
>
> rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
> EXT4_SB(sb)->s_gdb_count++;
> - ext4_kvfree_array_rcu(o_group_desc);
> + kvfree_rcu(o_group_desc);
> return err;
> }
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 111b0498a232..3942cd271a00 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -2759,7 +2759,7 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup)
> rcu_assign_pointer(sbi->s_flex_groups, new_groups);
> sbi->s_flex_groups_allocated = size;
> if (old_groups)
> - ext4_kvfree_array_rcu(old_groups);
> + kvfree_rcu(old_groups);
> return 0;
> }
>
> --
> 2.30.2
>


--
Uladzislau Rezki

2021-12-08 15:24:42

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 4/9] drivers: Switch to kvfree_rcu() API

On Wed, Nov 24, 2021 at 12:03:03PM +0100, Uladzislau Rezki (Sony) wrote:
> Instead of invoking a synchronize_rcu() to free a pointer
> after a grace period we can directly make use of new API
> that does the same but in more efficient way.

It isn't entirely new, kfree_rcu() has been around for ages and any of
these call sites could have made use of it if they wanted. The
kvfree_rcu() just adds the twist of transparently allocating memory.

We really must ask in each case why the original author didn't use
kfree_rcu()..

> drivers/block/drbd/drbd_nl.c | 9 +++------
> drivers/block/drbd/drbd_receiver.c | 6 ++----
> drivers/block/drbd/drbd_state.c | 3 +--
> drivers/block/rnbd/rnbd-srv.c | 3 +--
> drivers/crypto/nx/nx-common-pseries.c | 3 +--
> drivers/infiniband/hw/hfi1/sdma.c | 3 +--
> drivers/ipack/carriers/tpci200.c | 3 +--
> drivers/mfd/dln2.c | 6 ++----
> drivers/misc/vmw_vmci/vmci_context.c | 6 ++----
> drivers/misc/vmw_vmci/vmci_event.c | 3 +--
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 3 +--
> drivers/net/ethernet/mellanox/mlx5/core/en/qos.c | 3 +--
> drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c | 3 +--
> drivers/net/ethernet/mellanox/mlxsw/core.c | 3 +--
> drivers/scsi/device_handler/scsi_dh_alua.c | 3 +--
> drivers/scsi/device_handler/scsi_dh_rdac.c | 3 +--
> drivers/staging/fwserial/fwserial.c | 3 +--

These all need to be split to single patches and ack'ed by experts.

> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index 44ccf8b4f4b2..28f4d84945fd 100644
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -1679,8 +1679,7 @@ int drbd_adm_disk_opts(struct sk_buff *skb, struct genl_info *info)
> drbd_send_sync_param(peer_device);
> }
>
> - synchronize_rcu();
> - kfree(old_disk_conf);
> + kvfree_rcu(old_disk_conf);
> kfree(old_plan);

For instance, this, how do you know that old_plan isn't also RCU
protected?

> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index dde1cf51d0ab..0619cb94f0e0 100644
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -7190,8 +7190,7 @@ static void remove_one(struct pci_dev *pdev)
> }
> pci_release_regions(pdev);
> kfree(adapter->mbox_log);
> - synchronize_rcu();
> - kfree(adapter);
> + kvfree_rcu(adapter);
> }

And this, for instance, just looks crazy! There is only one RCU region
in this file and it is not protecting an adaptor pointer, it is
protecting a netdev. No idea what this is trying to do today even.

Each case needs to be audited to make sure the synchronize_rcu() is
only protecting the kfree and not other things as well. It is tricky
stuff.

I see you got an Ack for the infiniband peice, so feel free to send
that file to the linux-rdma list.

Jason