LinuxLists.cc - RE: [PATCH RFC] IB/mlx5: Reduce max order of memory allocated for xlt update

2021-02-22 16:30:41

Subject: RE: [PATCH RFC] IB/mlx5: Reduce max order of memory allocated for xlt update

Ping!

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: 12 February 2021 07:26 PM
To: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Cc: Praveen Kannoju <[email protected]>; Rama Nichanamatlu <[email protected]>; Rajesh Sivaramasubramaniom <[email protected]>
Subject: [PATCH RFC] IB/mlx5: Reduce max order of memory allocated for xlt update

To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up
to 1 MB (order-8) of memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds),
especially if the system is fragmented and does not have any free chunks
for orders >= 3. This causes the calling application to hang for a long
time.

To avoid these long latency spikes, limit the max order of allocation to
order 3, and reuse that buffer for the populate_xlt() calls for that MR.
This will increase the latency slightly (on the order of microseconds)
for each mlx5_ib_update_xlt() call, especially for larger MRs (since
we're making multiple calls to populate_xlt()), but it's a small price to
pay to avoid the large latency spikes with higher order allocations.

Signed-off-by: Praveen Kumar Kannoju <[email protected]>
---
drivers/infiniband/hw/mlx5/mr.c | 20 ++------------------
1 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 24f8d59..4f33127 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -986,9 +986,7 @@ static void set_mr_fields(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,
return mr;
}

-#define MLX5_MAX_UMR_CHUNK ((1 << (MLX5_MAX_UMR_SHIFT + 4)) - \
- MLX5_UMR_MTT_ALIGNMENT)
-#define MLX5_SPARE_UMR_CHUNK 0x10000
+#define MLX5_SPARE_UMR_CHUNK 0x8000

/*
* Allocate a temporary buffer to hold the per-page information to transfer to
@@ -1012,28 +1010,14 @@ static void set_mr_fields(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr,

gfp_mask |= __GFP_ZERO;

- /*
- * If the system already has a suitable high order page then just use
- * that, but don't try hard to create one. This max is about 1M, so a
- * free x86 huge page will satisfy it.
- */
size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
- MLX5_MAX_UMR_CHUNK);
+ MLX5_SPARE_UMR_CHUNK);
*nents = size / ent_size;
res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
get_order(size));
if (res)
return res;

- if (size > MLX5_SPARE_UMR_CHUNK) {
- size = MLX5_SPARE_UMR_CHUNK;
- *nents = get_order(size) / ent_size;
- res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
- get_order(size));
- if (res)
- return res;
- }
-
*nents = PAGE_SIZE / ent_size;
res = (void *)__get_free_page(gfp_mask);
if (res)
--
1.7.1

2021-02-22 16:50:37

by Jason Gunthorpe

Subject: Re: [PATCH RFC] IB/mlx5: Reduce max order of memory allocated for xlt update

On Mon, Feb 22, 2021 at 04:26:23PM +0000, Praveen Kannoju wrote:
> Ping!

Your original message didn't make it to the mailing list or
patchwork; you will need to fix your mailing environment and resend
it.

> - /*
> - * If the system already has a suitable high order page then just use
> - * that, but don't try hard to create one. This max is about 1M, so a
> - * free x86 huge page will satisfy it.
> - */
> size = min_t(size_t, ent_size * ALIGN(*nents, xlt_chunk_align),
> - MLX5_MAX_UMR_CHUNK);
> + MLX5_SPARE_UMR_CHUNK);
> *nents = size / ent_size;
> res = (void *)__get_free_pages(gfp_mask | __GFP_NOWARN,
> get_order(size));

IIRC, there is some GFP flag here that fails fast if the order is not
available, why not just use that?

Jason