2019-05-20 14:32:58

by Michal Kubecek

Subject: [PATCH] mlx5: avoid 64-bit division

Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
breaks i386 build by introducing three 64-bit divisions. As the divisor
is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
the division with bit operations.

Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
Signed-off-by: Michal Kubecek <[email protected]>
---
drivers/infiniband/hw/mlx5/cmd.c | 9 +++++++--
drivers/infiniband/hw/mlx5/main.c | 2 +-
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index e3ec79b8f7f5..6c8645033102 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -190,12 +190,12 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
u16 uid, phys_addr_t *addr, u32 *obj_id)
{
struct mlx5_core_dev *dev = dm->dev;
- u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
unsigned long *block_map;
u64 icm_start_addr;
u32 log_icm_size;
+ u32 num_blocks;
u32 max_blocks;
u64 block_idx;
void *sw_icm;
@@ -224,6 +224,8 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
return -EINVAL;
}

+ num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
spin_lock(&dm->lock);
block_idx = bitmap_find_next_zero_area(block_map,
@@ -266,13 +268,16 @@ int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
u16 uid, phys_addr_t addr, u32 obj_id)
{
struct mlx5_core_dev *dev = dm->dev;
- u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
unsigned long *block_map;
+ u32 num_blocks;
u64 start_idx;
int err;

+ num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
+ MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
+
switch (type) {
case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
start_idx =
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index abac70ad5c7c..340290b883fe 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
/* Allocation size must a multiple of the basic block size
* and a power of 2.
*/
- act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
+ act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
act_size = roundup_pow_of_two(act_size);

dm->size = act_size;
--
2.21.0
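
Editorial sketch of the arithmetic the patch relies on: on 32-bit targets such as i386 the kernel does not provide the libgcc helper for a plain 64-bit `/`, so DIV_ROUND_UP() on a u64 fails to link. When the divisor is a power of two, the same ceiling division can be written as an add and a shift. The helper name `blocks_needed` and the user-space types below are assumptions for illustration only; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): ceil(length / 2^log_block_size)
 * computed with an add and a right shift instead of a 64-bit division.
 * Like the patched code, it assumes length + block_size - 1 does not wrap.
 */
static inline uint32_t blocks_needed(uint64_t length, unsigned int log_block_size)
{
	uint64_t block_size;

	assert(log_block_size < 64);	/* shift count must stay in range */
	block_size = (uint64_t)1 << log_block_size;

	return (uint32_t)((length + block_size - 1) >> log_block_size);
}
```

With a 4 KiB block (log_block_size = 12), a length of 4097 bytes needs two blocks, the same result DIV_ROUND_UP(4097, 4096) would give.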



2019-05-27 18:16:56

by Jason Gunthorpe

Subject: Re: [PATCH] mlx5: avoid 64-bit division

On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> breaks i386 build by introducing three 64-bit divisions. As the divisor
> is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> the division with bit operations.
>
> Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> Signed-off-by: Michal Kubecek <[email protected]>
> drivers/infiniband/hw/mlx5/cmd.c | 9 +++++++--
> drivers/infiniband/hw/mlx5/main.c | 2 +-
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
> index e3ec79b8f7f5..6c8645033102 100644
> +++ b/drivers/infiniband/hw/mlx5/cmd.c
> @@ -190,12 +190,12 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
> u16 uid, phys_addr_t *addr, u32 *obj_id)
> {
> struct mlx5_core_dev *dev = dm->dev;
> - u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
> u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
> u32 in[MLX5_ST_SZ_DW(create_sw_icm_in)] = {};
> unsigned long *block_map;
> u64 icm_start_addr;
> u32 log_icm_size;
> + u32 num_blocks;
> u32 max_blocks;
> u64 block_idx;
> void *sw_icm;
> @@ -224,6 +224,8 @@ int mlx5_cmd_alloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
> return -EINVAL;
> }
>
> + num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
> + MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
> max_blocks = BIT(log_icm_size - MLX5_LOG_SW_ICM_BLOCK_SIZE(dev));
> spin_lock(&dm->lock);
> block_idx = bitmap_find_next_zero_area(block_map,
> @@ -266,13 +268,16 @@ int mlx5_cmd_dealloc_sw_icm(struct mlx5_dm *dm, int type, u64 length,
> u16 uid, phys_addr_t addr, u32 obj_id)
> {
> struct mlx5_core_dev *dev = dm->dev;
> - u32 num_blocks = DIV_ROUND_UP(length, MLX5_SW_ICM_BLOCK_SIZE(dev));
> u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
> u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
> unsigned long *block_map;
> + u32 num_blocks;
> u64 start_idx;
> int err;
>
> + num_blocks = (length + MLX5_SW_ICM_BLOCK_SIZE(dev) - 1) >>
> + MLX5_LOG_SW_ICM_BLOCK_SIZE(dev);
> +
> switch (type) {
> case MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM:
> start_idx =
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index abac70ad5c7c..340290b883fe 100644
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
> /* Allocation size must a multiple of the basic block size
> * and a power of 2.
> */
> - act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> + act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> act_size = roundup_pow_of_two(act_size);

It is kind of weird that we have both round_up and the bitshift
version. None of this is performance critical, so why not just use
round_up everywhere?

Ariel, is it true that MLX5_SW_ICM_BLOCK_SIZE will always be a power of
two?

Jason
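
Editorial sketch of why the power-of-two question matters: the kernel's roundup() is division-based and works for any positive divisor, while round_up() uses a bit mask and is only correct when the divisor is a power of two. The two functions below are simplified user-space stand-ins written for this note (the names `roundup_div` and `round_up_mask` are assumptions, and the real macros are type-generic).

```c
#include <assert.h>
#include <stdint.h>

/* Division-based rounding, in the spirit of roundup(): any y > 0 works. */
static inline uint64_t roundup_div(uint64_t x, uint64_t y)
{
	assert(y != 0);
	return ((x + (y - 1)) / y) * y;
}

/*
 * Mask-based rounding, in the spirit of round_up(): no division, but the
 * result is only a multiple of y when y is a power of two.
 */
static inline uint64_t round_up_mask(uint64_t x, uint64_t y)
{
	return ((x - 1) | (y - 1)) + 1;
}
```

For y = 4096 both give 8192 for x = 5000. For a divisor that is not a power of two, such as y = 6 with x = 10, the mask version returns 14 instead of 12, which is why the replacement is only safe if the block size is guaranteed to be a power of two.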

2019-05-27 20:50:09

by Michal Kubecek

Subject: Re: [PATCH] mlx5: avoid 64-bit division

On Mon, May 27, 2019 at 03:15:34PM -0300, Jason Gunthorpe wrote:
> On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index abac70ad5c7c..340290b883fe 100644
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -2344,7 +2344,7 @@ static int handle_alloc_dm_sw_icm(struct ib_ucontext *ctx,
> > /* Allocation size must a multiple of the basic block size
> > * and a power of 2.
> > */
> > - act_size = roundup(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> > + act_size = round_up(attr->length, MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));
> > act_size = roundup_pow_of_two(act_size);
>
> It is kind of weird that we have both round_up and the bitshift
> version. None of this is performance critical, so why not just use
> round_up everywhere?
>
> Ariel, is it true that MLX5_SW_ICM_BLOCK_SIZE will always be a power of
> two?

If it weren't, the requirements from the comment above could never be
satisfied as a power of two can only be a multiple of another power of
two. Which also means that what the code above does is in fact
equivalent to

act_size = max_t(u64, roundup_pow_of_two(attr->length),
MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev));

or

act_size = roundup_pow_of_two(max_t(u64, attr->length,
MLX5_SW_ICM_BLOCK_SIZE(dm_db->dev)));

Michal Kubecek
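
Editorial sketch checking the equivalence claimed above, under the assumptions that the block size is a power of two and the length is at least 1. All names here (`pow2_ceil`, `act_size_original`, `act_size_max_outside`, `act_size_max_inside`) are hypothetical user-space stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two >= n, standing in for roundup_pow_of_two(). */
static uint64_t pow2_ceil(uint64_t n)
{
	uint64_t p = 1;

	assert(n >= 1 && n <= ((uint64_t)1 << 63));
	while (p < n)
		p <<= 1;
	return p;
}

/* The code as patched: round up to a block multiple, then to a power of two. */
static uint64_t act_size_original(uint64_t length, uint64_t block)
{
	uint64_t rounded = ((length - 1) | (block - 1)) + 1;

	return pow2_ceil(rounded);
}

/* First alternative from the mail: max(roundup_pow_of_two(length), block). */
static uint64_t act_size_max_outside(uint64_t length, uint64_t block)
{
	uint64_t p = pow2_ceil(length);

	return p > block ? p : block;
}

/* Second alternative: roundup_pow_of_two(max(length, block)). */
static uint64_t act_size_max_inside(uint64_t length, uint64_t block)
{
	return pow2_ceil(length > block ? length : block);
}
```

With a 4096-byte block, all three return 4096 for lengths up to 4096, 8192 for a length of 5000, and 16384 for a length of 8193.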

2019-05-29 16:10:10

by Jason Gunthorpe

Subject: Re: [PATCH] mlx5: avoid 64-bit division

On Mon, May 20, 2019 at 01:19:02PM +0200, Michal Kubecek wrote:
> Commit 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> breaks i386 build by introducing three 64-bit divisions. As the divisor
> is MLX5_SW_ICM_BLOCK_SIZE() which is always a power of 2, we can replace
> the division with bit operations.
>
> Fixes: 25c13324d03d ("IB/mlx5: Add steering SW ICM device memory type")
> Signed-off-by: Michal Kubecek <[email protected]>
> Reviewed-by: Leon Romanovsky <[email protected]>
> ---
> drivers/infiniband/hw/mlx5/cmd.c | 9 +++++++--
> drivers/infiniband/hw/mlx5/main.c | 2 +-
> 2 files changed, 8 insertions(+), 3 deletions(-)

Applied to for-rc, thanks

Jason