2021-06-27 06:53:56

by Anand Khoje

[permalink] [raw]
Subject: [PATCH v6 for-next 0/2] IB/core: Obtaining subnet_prefix from cache in

This v6 patch series is used to read the port_attribute subnet_prefix
from a valid cache entry instead of having to call
device->ops.query_gid() for Infiniband link-layer devices in
__ib_query_port().

In the event of a cache update, the value for subnet_prefix gets read
using device->ops.query_gid() in config_non_roce_gid_cache().

Anand Khoje (2):
IB/core: Updating cache for subnet_prefix in
config_non_roce_gid_cache()
IB/core: Read subnet_prefix in ib_query_port via cache.
---
v1 -> v2:
- Split the v1 patch in 3 patches as per Leon's suggestion.

v2 -> v3:
- Added changes as per Mark Zhang's suggestion of clearing
flags in git_table_cleanup_one().
v3 -> v4:
- Removed the enum ib_port_data_flags and 8 byte flags from
struct ib_port_data, and the set_bit()/clear_bit() API
used to update this flag as that was not necessary.
Done to keep the code simple.
- Added code to read subnet_prefix from updated GID cache in the
event of cache update. Prior to this change, ib_cache_update
was reading the value for subnet_prefix via ib_query_port(),
due to this patch, we ended up reading a stale cached value of
subnet_prefix.
v4 -> v5:
- Removed the code to reset cache_is_initialised bit from cleanup
as per Leon's suggestion.
- Removed ib_cache_is_initialised() function.

v5 -> v6:
- Added changes as per Jason's suggestion of updating subnet_prefix
in config_non_roce_gid_cache() and removing the flag
cache_is_initialized in __ib_query_port().
---

drivers/infiniband/core/cache.c | 8 +++++---
drivers/infiniband/core/device.c | 7 ++-----
2 files changed, 7 insertions(+), 8 deletions(-)

--
1.8.3.1


2021-06-27 07:06:29

by Anand Khoje

[permalink] [raw]
Subject: [PATCH v6 for-next 1/2] IB/core: Updating cache for subnet_prefix in config_non_roce_gid_cache()

Currently cache for subnet_prefix is getting updated by reading port
attributes via ib_query_port. ib_query_port() calls ops.query_gid()
to get subnet_prefix and returns it via port_attr.

In ib_cache_update(), config_non_roce_gid_cache() obtains GIDs
by calling ops.query_gid(). We utilize this to store subnet_prefix
in cache.

Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
Suggested-by: Jason Gunthorpe <[email protected]>
Suggested-by: Aru Kolappan <[email protected]>
Signed-off-by: Anand Khoje <[email protected]>
Signed-off-by: Haakon Bugge <[email protected]>
---
drivers/infiniband/core/cache.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index c9e9fc8..929399e 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1429,7 +1429,7 @@ int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
EXPORT_SYMBOL(rdma_read_gid_l2_fields);

static int config_non_roce_gid_cache(struct ib_device *device,
- u32 port, int gid_tbl_len)
+ u32 port, struct ib_port_attr *tprops)
{
struct ib_gid_attr gid_attr = {};
struct ib_gid_table *table;
@@ -1441,7 +1441,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
table = rdma_gid_table(device, port);

mutex_lock(&table->lock);
- for (i = 0; i < gid_tbl_len; ++i) {
+ for (i = 0; i < tprops->gid_tbl_len; ++i) {
if (!device->ops.query_gid)
continue;
ret = device->ops.query_gid(device, port, i, &gid_attr.gid);
@@ -1452,6 +1452,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
goto err;
}
gid_attr.index = i;
+ tprops->subnet_prefix =
+ be64_to_cpu(gid_attr.gid.global.subnet_prefix);
add_modify_gid(table, &gid_attr);
}
err:
@@ -1484,7 +1486,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,

if (!rdma_protocol_roce(device, port) && update_gids) {
ret = config_non_roce_gid_cache(device, port,
- tprops->gid_tbl_len);
+ tprops);
if (ret)
goto err;
}
--
1.8.3.1

2021-06-27 07:13:54

by Anand Khoje

[permalink] [raw]
Subject: [PATCH v6 for-next 2/2] IB/core: Read subnet_prefix in ib_query_port via cache.

ib_query_port() calls device->ops.query_port() to get the port
attributes. The method of querying is device driver specific.
The same function calls device->ops.query_gid() to get the GID and
extract the subnet_prefix (gid_prefix).

The GID and subnet_prefix are stored in a cache. But they do not get
read from the cache if the device is an Infiniband device. The
following change takes advantage of the cached subnet_prefix.
Testing with RDBMS has shown a significant improvement in performance
with this change.

Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
Signed-off-by: Anand Khoje <[email protected]>
Signed-off-by: Haakon Bugge <[email protected]>
---
drivers/infiniband/core/device.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index fa20b18..9ad10c8 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2050,7 +2050,6 @@ static int __ib_query_port(struct ib_device *device,
u32 port_num,
struct ib_port_attr *port_attr)
{
- union ib_gid gid = {};
int err;

memset(port_attr, 0, sizeof(*port_attr));
@@ -2063,11 +2062,9 @@ static int __ib_query_port(struct ib_device *device,
IB_LINK_LAYER_INFINIBAND)
return 0;

- err = device->ops.query_gid(device, port_num, 0, &gid);
- if (err)
- return err;
+ ib_get_cached_subnet_prefix(device, port_num,
+ &port_attr->subnet_prefix);

- port_attr->subnet_prefix = be64_to_cpu(gid.global.subnet_prefix);
return 0;
}

--
1.8.3.1

2021-06-27 10:33:01

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [PATCH v6 for-next 0/2] IB/core: Obtaining subnet_prefix from cache in

On Sun, Jun 27, 2021 at 12:17:51PM +0530, Anand Khoje wrote:
> This v6 patch series is used to read the port_attribute subnet_prefix
> from a valid cache entry instead of having to call
> device->ops.query_gid() for Infiniband link-layer devices in
> __ib_query_port().
>
> In the event of a cache update, the value for subnet_prefix gets read
> using device->ops.query_gid() in config_non_roce_gid_cache().
>
> Anand Khoje (2):
> IB/core: Updating cache for subnet_prefix in
> config_non_roce_gid_cache()
> IB/core: Read subnet_prefix in ib_query_port via cache.

This series breaks mlx4/mlx5. You forgot to call to lock_init or
something like that.

[ 18.231107] INFO: trying to register non-static key.
[ 18.232775] The code is fine but needs lockdep annotation, or maybe
[ 18.247970] you didn't initialize this object before use?
[ 18.249458] turning off the locking correctness validator.
[ 18.250867] CPU: 6 PID: 333 Comm: systemd-udevd Not tainted 5.13.0-rc7_2021_06_27_07_56_03_ #1
[ 18.253147] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 18.255850] Call Trace:
[ 18.256649] dump_stack+0xa5/0xe6
[ 18.257625] register_lock_class+0x159a/0x1990
[ 18.258848] ? really_probe+0x217/0xd70
[ 18.260017] ? driver_probe_device+0x111/0x3c0
[ 18.261322] ? device_driver_attach+0x209/0x270
[ 18.261330] ? __driver_attach+0x133/0x280
[[0;32m OK [0m] Started [0;[ 18.261334] ? bus_for_each_dev+0x11e/0x1a0
1;39mNetwork Manager Script Dispatcher Service[[ 18.261338] ? bus_add_driver+0x309/0x580
0m.
[ 18.261347] ? is_dynamic_key+0x1d0/0x1d0
[ 18.261353] ? do_init_module+0x1c8/0x780
[ 18.261360] ? load_module+0x7131/0x9a50
[ 18.261364] ? __do_sys_finit_module+0x118/0x1b0
[ 18.261371] ? do_syscall_64+0x3f/0x80
[ 18.261377] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 18.261384] ? check_chain_key+0x229/0x5b0
[ 18.261389] __lock_acquire+0x10a/0x5fe0
[ 18.261396] ? register_lock_class+0x1990/0x1990
[ 18.261400] ? kasan_quarantine_put+0xd1/0x1f0
[ 18.261405] ? trace_hardirqs_on+0x32/0x120
[ 18.261413] lock_acquire+0x1b1/0x700
[ 18.261419] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
[ 18.283989] ? lockdep_hardirqs_on_prepare+0x400/0x400
[ 18.285397] ? mlx5_ib_query_port+0x982/0x12b0 [mlx5_ib]
[ 18.286861] ? __kasan_kmalloc+0x7c/0x90
[ 18.287939] ? alloc_port_data.part.0+0xa6/0x380 [ib_core]
[ 18.289408] ? mlx5_ib_rep_query_port+0x10/0x10 [mlx5_ib]
[ 18.290861] ? auxiliary_bus_probe+0x9d/0xe0
[ 18.292016] ? really_probe+0x217/0xd70
[ 18.293113] ? driver_probe_device+0x111/0x3c0
[ 18.294311] ? device_driver_attach+0x1f4/0x270
[ 18.295588] ? __driver_attach+0x133/0x280
[ 18.296772] _raw_read_lock_irqsave+0x72/0x90
[ 18.297965] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
[ 18.299564] ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
[ 18.301150] ib_query_port+0x388/0x6c0 [ib_core]
[ 18.303752] mlx5_port_immutable+0x3bf/0x4b0 [mlx5_ib]
[ 18.305691] ? mlx5_ib_stage_caps_init+0xa90/0xa90 [mlx5_ib]
[ 18.307757] ? rcu_read_lock_sched_held+0x3f/0x70
[ 18.309543] ? is_module_address+0x25/0x40
[ 18.311093] ? static_obj+0x8a/0xc0
[ 18.312496] ? alloc_port_data.part.0+0x1e6/0x380 [ib_core]
[ 18.314642] ib_register_device+0x36f/0x950 [ib_core]
[ 18.316585] ? enable_device_and_get+0x340/0x340 [ib_core]
[ 18.318701] ? do_raw_spin_unlock+0x54/0x220
[ 18.320373] ? _raw_spin_unlock+0x1f/0x30
[ 18.321971] __mlx5_ib_add+0x6c/0x140 [mlx5_ib]
[ 18.323737] mlx5r_probe+0x1f5/0x380 [mlx5_ib]
[ 18.325495] ? kernfs_add_one+0x2a3/0x400
[ 18.327085] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
[ 18.329009] ? kernfs_create_link+0x16c/0x230
[ 18.330682] ? auxiliary_match_id+0xe9/0x140
[ 18.332352] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
[ 18.334359] auxiliary_bus_probe+0x9d/0xe0
[ 18.336398] really_probe+0x217/0xd70
[ 18.338001] driver_probe_device+0x111/0x3c0
[ 18.339659] device_driver_attach+0x209/0x270
[ 18.341368] __driver_attach+0x133/0x280
[ 18.342892] ? device_driver_attach+0x270/0x270
[ 18.344651] bus_for_each_dev+0x11e/0x1a0
[ 18.346237] ? lockdep_init_map_type+0x2c3/0x790
[ 18.348016] ? subsys_dev_iter_exit+0x10/0x10
[ 18.349682] bus_add_driver+0x309/0x580
[ 18.351374] driver_register+0x1ee/0x380
[ 18.353003] mlx5_ib_init+0x100/0x15f [mlx5_ib]
[ 18.354745] ? 0xffffffffa07c0000
[ 18.356096] do_one_initcall+0xd5/0x430
[ 18.357618] ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
[ 18.359808] ? rcu_read_lock_sched_held+0x3f/0x70
[ 18.361638] ? kasan_unpoison+0x23/0x50
[ 18.363124] do_init_module+0x1c8/0x780
[ 18.364659] load_module+0x7131/0x9a50
[ 18.366166] ? module_frob_arch_sections+0x20/0x20
[ 18.368037] ? lock_downgrade+0x6e0/0x6e0
[ 18.369645] ? kernel_read_file+0x284/0x6a0
[ 18.371279] ? __x64_sys_fsconfig+0x980/0x980
[ 18.373007] ? kernel_read_file_from_fd+0x4b/0x90
[ 18.374769] __do_sys_finit_module+0x118/0x1b0
[ 18.376506] ? __do_sys_init_module+0x250/0x250
[ 18.378234] ? seccomp_do_user_notification.constprop.0+0xb30/0xb30
[ 18.380549] do_syscall_64+0x3f/0x80
[ 18.382002] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 18.383968] RIP: 0033:0x7f52e7cac13d
[ 18.385444] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6d 0c 00 f7 d8 64 89 01 48
[ 18.392025] RSP: 002b:00007fff8a5ab448 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 18.394893] RAX: ffffffffffffffda RBX: 0000563420129ce0 RCX: 00007f52e7cac13d
[ 18.397513] RDX: 0000000000000000 RSI: 00007f52e790995d RDI: 000000000000000f
[ 18.400106] RBP: 0000000000020000 R08: 0000000000000000 R09: 00005634200e5f10
[ 18.402670] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
[ 18.405185] R13: 00007f52e790995d R14: 00005634201261c0 R15: 00005634201245f0




> ---
> v1 -> v2:
> - Split the v1 patch in 3 patches as per Leon's suggestion.
>
> v2 -> v3:
> - Added changes as per Mark Zhang's suggestion of clearing
> flags in git_table_cleanup_one().
> v3 -> v4:
> - Removed the enum ib_port_data_flags and 8 byte flags from
> struct ib_port_data, and the set_bit()/clear_bit() API
> used to update this flag as that was not necessary.
> Done to keep the code simple.
> - Added code to read subnet_prefix from updated GID cache in the
> event of cache update. Prior to this change, ib_cache_update
> was reading the value for subnet_prefix via ib_query_port(),
> due to this patch, we ended up reading a stale cached value of
> subnet_prefix.
> v4 -> v5:
> - Removed the code to reset cache_is_initialised bit from cleanup
> as per Leon's suggestion.
> - Removed ib_cache_is_initialised() function.
>
> v5 -> v6:
> - Added changes as per Jason's suggestion of updating subnet_prefix
> in config_non_roce_gid_cache() and removing the flag
> cache_is_initialized in __ib_query_port().
> ---
>
> drivers/infiniband/core/cache.c | 8 +++++---
> drivers/infiniband/core/device.c | 7 ++-----
> 2 files changed, 7 insertions(+), 8 deletions(-)
>
> --
> 1.8.3.1
>

2021-06-27 10:50:36

by Håkon Bugge

[permalink] [raw]
Subject: Re: [PATCH v6 for-next 0/2] IB/core: Obtaining subnet_prefix from cache in



> On 27 Jun 2021, at 12:31, Leon Romanovsky <[email protected]> wrote:
>
> On Sun, Jun 27, 2021 at 12:17:51PM +0530, Anand Khoje wrote:
>> This v6 patch series is used to read the port_attribute subnet_prefix
>> from a valid cache entry instead of having to call
>> device->ops.query_gid() for Infiniband link-layer devices in
>> __ib_query_port().
>>
>> In the event of a cache update, the value for subnet_prefix gets read
>> using device->ops.query_gid() in config_non_roce_gid_cache().
>>
>> Anand Khoje (2):
>> IB/core: Updating cache for subnet_prefix in
>> config_non_roce_gid_cache()
>> IB/core: Read subnet_prefix in ib_query_port via cache.
>
> This series breaks mlx4/mlx5. You forgot to call to lock_init or
> something like that.

Thanks for catching!

Sure, in ib_register_device(), setup_device() (which ends up calling __ib_query_port()) is called before ib_cache_setup_one(). Can these two calls have their order swapped?

If not, we need to get back to the cache_is_initialized flag.


Thxs, Håkon

>
> [ 18.231107] INFO: trying to register non-static key.
> [ 18.232775] The code is fine but needs lockdep annotation, or maybe
> [ 18.247970] you didn't initialize this object before use?
> [ 18.249458] turning off the locking correctness validator.
> [ 18.250867] CPU: 6 PID: 333 Comm: systemd-udevd Not tainted 5.13.0-rc7_2021_06_27_07_56_03_ #1
> [ 18.253147] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 18.255850] Call Trace:
> [ 18.256649] dump_stack+0xa5/0xe6
> [ 18.257625] register_lock_class+0x159a/0x1990
> [ 18.258848] ? really_probe+0x217/0xd70
> [ 18.260017] ? driver_probe_device+0x111/0x3c0
> [ 18.261322] ? device_driver_attach+0x209/0x270
> [ 18.261330] ? __driver_attach+0x133/0x280
> [[0;32m OK [0m] Started [0;[ 18.261334] ? bus_for_each_dev+0x11e/0x1a0
> 1;39mNetwork Manager Script Dispatcher Service[[ 18.261338] ? bus_add_driver+0x309/0x580
> 0m.
> [ 18.261347] ? is_dynamic_key+0x1d0/0x1d0
> [ 18.261353] ? do_init_module+0x1c8/0x780
> [ 18.261360] ? load_module+0x7131/0x9a50
> [ 18.261364] ? __do_sys_finit_module+0x118/0x1b0
> [ 18.261371] ? do_syscall_64+0x3f/0x80
> [ 18.261377] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 18.261384] ? check_chain_key+0x229/0x5b0
> [ 18.261389] __lock_acquire+0x10a/0x5fe0
> [ 18.261396] ? register_lock_class+0x1990/0x1990
> [ 18.261400] ? kasan_quarantine_put+0xd1/0x1f0
> [ 18.261405] ? trace_hardirqs_on+0x32/0x120
> [ 18.261413] lock_acquire+0x1b1/0x700
> [ 18.261419] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> [ 18.283989] ? lockdep_hardirqs_on_prepare+0x400/0x400
> [ 18.285397] ? mlx5_ib_query_port+0x982/0x12b0 [mlx5_ib]
> [ 18.286861] ? __kasan_kmalloc+0x7c/0x90
> [ 18.287939] ? alloc_port_data.part.0+0xa6/0x380 [ib_core]
> [ 18.289408] ? mlx5_ib_rep_query_port+0x10/0x10 [mlx5_ib]
> [ 18.290861] ? auxiliary_bus_probe+0x9d/0xe0
> [ 18.292016] ? really_probe+0x217/0xd70
> [ 18.293113] ? driver_probe_device+0x111/0x3c0
> [ 18.294311] ? device_driver_attach+0x1f4/0x270
> [ 18.295588] ? __driver_attach+0x133/0x280
> [ 18.296772] _raw_read_lock_irqsave+0x72/0x90
> [ 18.297965] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> [ 18.299564] ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> [ 18.301150] ib_query_port+0x388/0x6c0 [ib_core]
> [ 18.303752] mlx5_port_immutable+0x3bf/0x4b0 [mlx5_ib]
> [ 18.305691] ? mlx5_ib_stage_caps_init+0xa90/0xa90 [mlx5_ib]
> [ 18.307757] ? rcu_read_lock_sched_held+0x3f/0x70
> [ 18.309543] ? is_module_address+0x25/0x40
> [ 18.311093] ? static_obj+0x8a/0xc0
> [ 18.312496] ? alloc_port_data.part.0+0x1e6/0x380 [ib_core]
> [ 18.314642] ib_register_device+0x36f/0x950 [ib_core]
> [ 18.316585] ? enable_device_and_get+0x340/0x340 [ib_core]
> [ 18.318701] ? do_raw_spin_unlock+0x54/0x220
> [ 18.320373] ? _raw_spin_unlock+0x1f/0x30
> [ 18.321971] __mlx5_ib_add+0x6c/0x140 [mlx5_ib]
> [ 18.323737] mlx5r_probe+0x1f5/0x380 [mlx5_ib]
> [ 18.325495] ? kernfs_add_one+0x2a3/0x400
> [ 18.327085] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
> [ 18.329009] ? kernfs_create_link+0x16c/0x230
> [ 18.330682] ? auxiliary_match_id+0xe9/0x140
> [ 18.332352] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
> [ 18.334359] auxiliary_bus_probe+0x9d/0xe0
> [ 18.336398] really_probe+0x217/0xd70
> [ 18.338001] driver_probe_device+0x111/0x3c0
> [ 18.339659] device_driver_attach+0x209/0x270
> [ 18.341368] __driver_attach+0x133/0x280
> [ 18.342892] ? device_driver_attach+0x270/0x270
> [ 18.344651] bus_for_each_dev+0x11e/0x1a0
> [ 18.346237] ? lockdep_init_map_type+0x2c3/0x790
> [ 18.348016] ? subsys_dev_iter_exit+0x10/0x10
> [ 18.349682] bus_add_driver+0x309/0x580
> [ 18.351374] driver_register+0x1ee/0x380
> [ 18.353003] mlx5_ib_init+0x100/0x15f [mlx5_ib]
> [ 18.354745] ? 0xffffffffa07c0000
> [ 18.356096] do_one_initcall+0xd5/0x430
> [ 18.357618] ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
> [ 18.359808] ? rcu_read_lock_sched_held+0x3f/0x70
> [ 18.361638] ? kasan_unpoison+0x23/0x50
> [ 18.363124] do_init_module+0x1c8/0x780
> [ 18.364659] load_module+0x7131/0x9a50
> [ 18.366166] ? module_frob_arch_sections+0x20/0x20
> [ 18.368037] ? lock_downgrade+0x6e0/0x6e0
> [ 18.369645] ? kernel_read_file+0x284/0x6a0
> [ 18.371279] ? __x64_sys_fsconfig+0x980/0x980
> [ 18.373007] ? kernel_read_file_from_fd+0x4b/0x90
> [ 18.374769] __do_sys_finit_module+0x118/0x1b0
> [ 18.376506] ? __do_sys_init_module+0x250/0x250
> [ 18.378234] ? seccomp_do_user_notification.constprop.0+0xb30/0xb30
> [ 18.380549] do_syscall_64+0x3f/0x80
> [ 18.382002] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 18.383968] RIP: 0033:0x7f52e7cac13d
> [ 18.385444] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6d 0c 00 f7 d8 64 89 01 48
> [ 18.392025] RSP: 002b:00007fff8a5ab448 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> [ 18.394893] RAX: ffffffffffffffda RBX: 0000563420129ce0 RCX: 00007f52e7cac13d
> [ 18.397513] RDX: 0000000000000000 RSI: 00007f52e790995d RDI: 000000000000000f
> [ 18.400106] RBP: 0000000000020000 R08: 0000000000000000 R09: 00005634200e5f10
> [ 18.402670] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
> [ 18.405185] R13: 00007f52e790995d R14: 00005634201261c0 R15: 00005634201245f0
>
>
>
>
>> ---
>> v1 -> v2:
>> - Split the v1 patch in 3 patches as per Leon's suggestion.
>>
>> v2 -> v3:
>> - Added changes as per Mark Zhang's suggestion of clearing
>> flags in git_table_cleanup_one().
>> v3 -> v4:
>> - Removed the enum ib_port_data_flags and 8 byte flags from
>> struct ib_port_data, and the set_bit()/clear_bit() API
>> used to update this flag as that was not necessary.
>> Done to keep the code simple.
>> - Added code to read subnet_prefix from updated GID cache in the
>> event of cache update. Prior to this change, ib_cache_update
>> was reading the value for subnet_prefix via ib_query_port(),
>> due to this patch, we ended up reading a stale cached value of
>> subnet_prefix.
>> v4 -> v5:
>> - Removed the code to reset cache_is_initialised bit from cleanup
>> as per Leon's suggestion.
>> - Removed ib_cache_is_initialised() function.
>>
>> v5 -> v6:
>> - Added changes as per Jason's suggestion of updating subnet_prefix
>> in config_non_roce_gid_cache() and removing the flag
>> cache_is_initialized in __ib_query_port().
>> ---
>>
>> drivers/infiniband/core/cache.c | 8 +++++---
>> drivers/infiniband/core/device.c | 7 ++-----
>> 2 files changed, 7 insertions(+), 8 deletions(-)
>>
>> --
>> 1.8.3.1

2021-06-27 11:37:54

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [PATCH v6 for-next 0/2] IB/core: Obtaining subnet_prefix from cache in

On Sun, Jun 27, 2021 at 10:40:43AM +0000, Haakon Bugge wrote:
>
>
> > On 27 Jun 2021, at 12:31, Leon Romanovsky <[email protected]> wrote:
> >
> > On Sun, Jun 27, 2021 at 12:17:51PM +0530, Anand Khoje wrote:
> >> This v6 patch series is used to read the port_attribute subnet_prefix
> >> from a valid cache entry instead of having to call
> >> device->ops.query_gid() for Infiniband link-layer devices in
> >> __ib_query_port().
> >>
> >> In the event of a cache update, the value for subnet_prefix gets read
> >> using device->ops.query_gid() in config_non_roce_gid_cache().
> >>
> >> Anand Khoje (2):
> >> IB/core: Updating cache for subnet_prefix in
> >> config_non_roce_gid_cache()
> >> IB/core: Read subnet_prefix in ib_query_port via cache.
> >
> > This series breaks mlx4/mlx5. You forgot to call to lock_init or
> > something like that.
>
> Thanks for catching!
>
> Sure, in ib_register_device(), setup_device() (which ends up calling __ib_query_port()) is called before ib_cache_setup_one(). Can these two calls have their order swapped?

I don't think so, if I didn't miss anything, we are relying in gid_table_setup_one()
on some properties from setup_device().

>
> If not, we need to get back to the cache_is_initialized flag.
>
>
> Thxs, H?kon
>
> >
> > [ 18.231107] INFO: trying to register non-static key.
> > [ 18.232775] The code is fine but needs lockdep annotation, or maybe
> > [ 18.247970] you didn't initialize this object before use?
> > [ 18.249458] turning off the locking correctness validator.
> > [ 18.250867] CPU: 6 PID: 333 Comm: systemd-udevd Not tainted 5.13.0-rc7_2021_06_27_07_56_03_ #1
> > [ 18.253147] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> > [ 18.255850] Call Trace:
> > [ 18.256649] dump_stack+0xa5/0xe6
> > [ 18.257625] register_lock_class+0x159a/0x1990
> > [ 18.258848] ? really_probe+0x217/0xd70
> > [ 18.260017] ? driver_probe_device+0x111/0x3c0
> > [ 18.261322] ? device_driver_attach+0x209/0x270
> > [ 18.261330] ? __driver_attach+0x133/0x280
> > [[0;32m OK [0m] Started [0;[ 18.261334] ? bus_for_each_dev+0x11e/0x1a0
> > 1;39mNetwork Manager Script Dispatcher Service[[ 18.261338] ? bus_add_driver+0x309/0x580
> > 0m.
> > [ 18.261347] ? is_dynamic_key+0x1d0/0x1d0
> > [ 18.261353] ? do_init_module+0x1c8/0x780
> > [ 18.261360] ? load_module+0x7131/0x9a50
> > [ 18.261364] ? __do_sys_finit_module+0x118/0x1b0
> > [ 18.261371] ? do_syscall_64+0x3f/0x80
> > [ 18.261377] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 18.261384] ? check_chain_key+0x229/0x5b0
> > [ 18.261389] __lock_acquire+0x10a/0x5fe0
> > [ 18.261396] ? register_lock_class+0x1990/0x1990
> > [ 18.261400] ? kasan_quarantine_put+0xd1/0x1f0
> > [ 18.261405] ? trace_hardirqs_on+0x32/0x120
> > [ 18.261413] lock_acquire+0x1b1/0x700
> > [ 18.261419] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> > [ 18.283989] ? lockdep_hardirqs_on_prepare+0x400/0x400
> > [ 18.285397] ? mlx5_ib_query_port+0x982/0x12b0 [mlx5_ib]
> > [ 18.286861] ? __kasan_kmalloc+0x7c/0x90
> > [ 18.287939] ? alloc_port_data.part.0+0xa6/0x380 [ib_core]
> > [ 18.289408] ? mlx5_ib_rep_query_port+0x10/0x10 [mlx5_ib]
> > [ 18.290861] ? auxiliary_bus_probe+0x9d/0xe0
> > [ 18.292016] ? really_probe+0x217/0xd70
> > [ 18.293113] ? driver_probe_device+0x111/0x3c0
> > [ 18.294311] ? device_driver_attach+0x1f4/0x270
> > [ 18.295588] ? __driver_attach+0x133/0x280
> > [ 18.296772] _raw_read_lock_irqsave+0x72/0x90
> > [ 18.297965] ? ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> > [ 18.299564] ib_get_cached_subnet_prefix+0x28/0xd0 [ib_core]
> > [ 18.301150] ib_query_port+0x388/0x6c0 [ib_core]
> > [ 18.303752] mlx5_port_immutable+0x3bf/0x4b0 [mlx5_ib]
> > [ 18.305691] ? mlx5_ib_stage_caps_init+0xa90/0xa90 [mlx5_ib]
> > [ 18.307757] ? rcu_read_lock_sched_held+0x3f/0x70
> > [ 18.309543] ? is_module_address+0x25/0x40
> > [ 18.311093] ? static_obj+0x8a/0xc0
> > [ 18.312496] ? alloc_port_data.part.0+0x1e6/0x380 [ib_core]
> > [ 18.314642] ib_register_device+0x36f/0x950 [ib_core]
> > [ 18.316585] ? enable_device_and_get+0x340/0x340 [ib_core]
> > [ 18.318701] ? do_raw_spin_unlock+0x54/0x220
> > [ 18.320373] ? _raw_spin_unlock+0x1f/0x30
> > [ 18.321971] __mlx5_ib_add+0x6c/0x140 [mlx5_ib]
> > [ 18.323737] mlx5r_probe+0x1f5/0x380 [mlx5_ib]
> > [ 18.325495] ? kernfs_add_one+0x2a3/0x400
> > [ 18.327085] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
> > [ 18.329009] ? kernfs_create_link+0x16c/0x230
> > [ 18.330682] ? auxiliary_match_id+0xe9/0x140
> > [ 18.332352] ? __mlx5_ib_add+0x140/0x140 [mlx5_ib]
> > [ 18.334359] auxiliary_bus_probe+0x9d/0xe0
> > [ 18.336398] really_probe+0x217/0xd70
> > [ 18.338001] driver_probe_device+0x111/0x3c0
> > [ 18.339659] device_driver_attach+0x209/0x270
> > [ 18.341368] __driver_attach+0x133/0x280
> > [ 18.342892] ? device_driver_attach+0x270/0x270
> > [ 18.344651] bus_for_each_dev+0x11e/0x1a0
> > [ 18.346237] ? lockdep_init_map_type+0x2c3/0x790
> > [ 18.348016] ? subsys_dev_iter_exit+0x10/0x10
> > [ 18.349682] bus_add_driver+0x309/0x580
> > [ 18.351374] driver_register+0x1ee/0x380
> > [ 18.353003] mlx5_ib_init+0x100/0x15f [mlx5_ib]
> > [ 18.354745] ? 0xffffffffa07c0000
> > [ 18.356096] do_one_initcall+0xd5/0x430
> > [ 18.357618] ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
> > [ 18.359808] ? rcu_read_lock_sched_held+0x3f/0x70
> > [ 18.361638] ? kasan_unpoison+0x23/0x50
> > [ 18.363124] do_init_module+0x1c8/0x780
> > [ 18.364659] load_module+0x7131/0x9a50
> > [ 18.366166] ? module_frob_arch_sections+0x20/0x20
> > [ 18.368037] ? lock_downgrade+0x6e0/0x6e0
> > [ 18.369645] ? kernel_read_file+0x284/0x6a0
> > [ 18.371279] ? __x64_sys_fsconfig+0x980/0x980
> > [ 18.373007] ? kernel_read_file_from_fd+0x4b/0x90
> > [ 18.374769] __do_sys_finit_module+0x118/0x1b0
> > [ 18.376506] ? __do_sys_init_module+0x250/0x250
> > [ 18.378234] ? seccomp_do_user_notification.constprop.0+0xb30/0xb30
> > [ 18.380549] do_syscall_64+0x3f/0x80
> > [ 18.382002] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 18.383968] RIP: 0033:0x7f52e7cac13d
> > [ 18.385444] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2b 6d 0c 00 f7 d8 64 89 01 48
> > [ 18.392025] RSP: 002b:00007fff8a5ab448 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> > [ 18.394893] RAX: ffffffffffffffda RBX: 0000563420129ce0 RCX: 00007f52e7cac13d
> > [ 18.397513] RDX: 0000000000000000 RSI: 00007f52e790995d RDI: 000000000000000f
> > [ 18.400106] RBP: 0000000000020000 R08: 0000000000000000 R09: 00005634200e5f10
> > [ 18.402670] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
> > [ 18.405185] R13: 00007f52e790995d R14: 00005634201261c0 R15: 00005634201245f0
> >
> >
> >
> >
> >> ---
> >> v1 -> v2:
> >> - Split the v1 patch in 3 patches as per Leon's suggestion.
> >>
> >> v2 -> v3:
> >> - Added changes as per Mark Zhang's suggestion of clearing
> >> flags in git_table_cleanup_one().
> >> v3 -> v4:
> >> - Removed the enum ib_port_data_flags and 8 byte flags from
> >> struct ib_port_data, and the set_bit()/clear_bit() API
> >> used to update this flag as that was not necessary.
> >> Done to keep the code simple.
> >> - Added code to read subnet_prefix from updated GID cache in the
> >> event of cache update. Prior to this change, ib_cache_update
> >> was reading the value for subnet_prefix via ib_query_port(),
> >> due to this patch, we ended up reading a stale cached value of
> >> subnet_prefix.
> >> v4 -> v5:
> >> - Removed the code to reset cache_is_initialised bit from cleanup
> >> as per Leon's suggestion.
> >> - Removed ib_cache_is_initialised() function.
> >>
> >> v5 -> v6:
> >> - Added changes as per Jason's suggestion of updating subnet_prefix
> >> in config_non_roce_gid_cache() and removing the flag
> >> cache_is_initialized in __ib_query_port().
> >> ---
> >>
> >> drivers/infiniband/core/cache.c | 8 +++++---
> >> drivers/infiniband/core/device.c | 7 ++-----
> >> 2 files changed, 7 insertions(+), 8 deletions(-)
> >>
> >> --
> >> 1.8.3.1
>

2021-06-28 23:38:58

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v6 for-next 0/2] IB/core: Obtaining subnet_prefix from cache in

On Sun, Jun 27, 2021 at 02:33:31PM +0300, Leon Romanovsky wrote:
> On Sun, Jun 27, 2021 at 10:40:43AM +0000, Haakon Bugge wrote:
> >
> >
> > > On 27 Jun 2021, at 12:31, Leon Romanovsky <[email protected]> wrote:
> > >
> > > On Sun, Jun 27, 2021 at 12:17:51PM +0530, Anand Khoje wrote:
> > >> This v6 patch series is used to read the port_attribute subnet_prefix
> > >> from a valid cache entry instead of having to call
> > >> device->ops.query_gid() for Infiniband link-layer devices in
> > >> __ib_query_port().
> > >>
> > >> In the event of a cache update, the value for subnet_prefix gets read
> > >> using device->ops.query_gid() in config_non_roce_gid_cache().
> > >>
> > >> Anand Khoje (2):
> > >> IB/core: Updating cache for subnet_prefix in
> > >> config_non_roce_gid_cache()
> > >> IB/core: Read subnet_prefix in ib_query_port via cache.
> > >
> > > This series breaks mlx4/mlx5. You forgot to call to lock_init or
> > > something like that.
> >
> > Thanks for catching!
> >
> > Sure, in ib_register_device(), setup_device() (which ends up calling __ib_query_port()) is called before ib_cache_setup_one(). Can these two calls have their order swapped?
>
> I don't think so, if I didn't miss anything, we are relying in gid_table_setup_one()
> on some properties from setup_device().

Just reorder things enough so that the cache_lock is setup earlier, it
has no business being in cache_setup_one anyhow.

Jason