From: Anton Blanchard <antonb at au1.ibm.com>
To improve performance of driver ressource allocation,
replace the vmalloc() call with kmalloc().
Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com>
---
drivers/infiniband/hw/ehca/ipz_pt_fn.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index c3a3284..a260559 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -220,7 +220,7 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
queue->small_page = NULL;
/* allocate queue page pointers */
- queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
+ queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
if (!queue->queue_pages) {
ehca_gen_err("Couldn't allocate queue page list");
return 0;
@@ -240,7 +240,7 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
ipz_queue_ctor_exit0:
ehca_gen_err("Couldn't alloc pages queue=%p "
"nr_of_pages=%x", queue, nr_of_pages);
- vfree(queue->queue_pages);
+ kfree(queue->queue_pages);
return 0;
}
@@ -262,7 +262,7 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
free_page((unsigned long)queue->queue_pages[i]);
}
- vfree(queue->queue_pages);
+ kfree(queue->queue_pages);
return 1;
}
--
1.5.5
> + queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
How big might this buffer be? Any chance of allocation failure due to
memory fragmentation?
- R.
In case of large queue pairs there is the possibillity of allocation failures
due to memory fragmentationo with kmalloc().To ensure the memory is allocated even
if kmalloc() can not find chunks which are big enough, we try to allocate the memory
with vmalloc().
Signed-off-by: Stefan Roscher <[email protected]>
---
On Tuesday 21 April 2009 07:34:30 pm Roland Dreier wrote:
> > + queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
>
> How big might this buffer be? Any chance of allocation failure due to
> memory fragmentation?
>
> - R.
Hey Roland,
yes you are right and here is the patch to circumvent the described problem.
It will apply on top of the patchset.
regards Stefan
drivers/infiniband/hw/ehca/ipz_pt_fn.c | 17 +++++++++++++----
1 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index a260559..1227c59 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -222,8 +222,11 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
/* allocate queue page pointers */
queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
if (!queue->queue_pages) {
- ehca_gen_err("Couldn't allocate queue page list");
- return 0;
+ queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
+ if (!queue->queue_pages) {
+ ehca_gen_err("Couldn't allocate queue page list");
+ return 0;
+ }
}
memset(queue->queue_pages, 0, nr_of_pages * sizeof(void *));
@@ -240,7 +243,10 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
ipz_queue_ctor_exit0:
ehca_gen_err("Couldn't alloc pages queue=%p "
"nr_of_pages=%x", queue, nr_of_pages);
- kfree(queue->queue_pages);
+ if (is_vmalloc_addr(queue->queue_pages))
+ vfree(queue->queue_pages);
+ else
+ kfree(queue->queue_pages);
return 0;
}
@@ -262,7 +268,10 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
free_page((unsigned long)queue->queue_pages[i]);
}
- kfree(queue->queue_pages);
+ if (is_vmalloc_addr(queue->queue_pages))
+ vfree(queue->queue_pages);
+ else
+ kfree(queue->queue_pages);
return 1;
}
--
1.5.5
On Wednesday 22 April 2009 04:10:18 pm michael wrote:
> Hi,
>
> I don't take the point, if it is not import use the vmalloc. Why you try
> with a kmalloc
> alloc first? and why do not use kzalloc?
Because kmalloc() is faster than vmalloc() causing a huge performance win
when someone allocates a large number of queue pairs. We fall back to
vmalloc() only if kmalloc() can't deliver the memory chunk.
We don't need kzalloc because we fill the list right after the alloc.
regards Stefan
Hi,
Stefan Roscher wrote:
> On Wednesday 22 April 2009 04:10:18 pm michael wrote:
>
>> Hi,
>>
>>
>
>
>> I don't take the point, if it is not import use the vmalloc. Why you try
>> with a kmalloc
>> alloc first? and why do not use kzalloc?
>>
>
> Because kmalloc() is faster than vmalloc() causing a huge performance win
> when someone allocates a large number of queue pairs. We fall back to
> vmalloc() only if kmalloc() can't deliver the memory chunk.
>
Sorry I catch later the performace issue.
> We don't need kzalloc because we fill the list right after the alloc.
>
> regards Stefan
>
Regards Michael
> _______________________________________________
> Linuxppc-dev mailing list
> [email protected]
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
>
>
Hi,
Stefan Roscher wrote:
> In case of large queue pairs there is the possibillity of allocation failures
> due to memory fragmentationo with kmalloc().To ensure the memory is allocated even
> if kmalloc() can not find chunks which are big enough, we try to allocate the memory
> with vmalloc().
>
> Signed-off-by: Stefan Roscher <[email protected]>
> ---
>
> On Tuesday 21 April 2009 07:34:30 pm Roland Dreier wrote:
>
>> > + queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
>>
>> How big might this buffer be? Any chance of allocation failure due to
>> memory fragmentation?
>>
>> - R.
>>
> Hey Roland,
> yes you are right and here is the patch to circumvent the described problem.
> It will apply on top of the patchset.
> regards Stefan
>
>
I don't take the point, if it is not import use the vmalloc. Why you try
with a kmalloc
alloc first? and why do not use kzalloc?
>
> drivers/infiniband/hw/ehca/ipz_pt_fn.c | 17 +++++++++++++----
> 1 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> index a260559..1227c59 100644
> --- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> @@ -222,8 +222,11 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
> /* allocate queue page pointers */
> queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
> if (!queue->queue_pages) {
> - ehca_gen_err("Couldn't allocate queue page list");
> - return 0;
> + queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
> + if (!queue->queue_pages) {
> + ehca_gen_err("Couldn't allocate queue page list");
> + return 0;
> + }
> }
> memset(queue->queue_pages, 0, nr_of_pages * sizeof(void *));
>
> @@ -240,7 +243,10 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
> ipz_queue_ctor_exit0:
> ehca_gen_err("Couldn't alloc pages queue=%p "
> "nr_of_pages=%x", queue, nr_of_pages);
> - kfree(queue->queue_pages);
> + if (is_vmalloc_addr(queue->queue_pages))
> + vfree(queue->queue_pages);
> + else
> + kfree(queue->queue_pages);
>
> return 0;
> }
> @@ -262,7 +268,10 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
> free_page((unsigned long)queue->queue_pages[i]);
> }
>
> - kfree(queue->queue_pages);
> + if (is_vmalloc_addr(queue->queue_pages))
> + vfree(queue->queue_pages);
> + else
> + kfree(queue->queue_pages);
>
> return 1;
> }
>
Regards Michael
Hi Roland,
did you have a chance to take a look at the patchset and will you apply it, or
are there any outstanding issues we need to address?
Regards,
Alex
On Wed, 22 Apr 2009 16:02:28 +0200
Stefan Roscher <[email protected]> wrote:
> In case of large queue pairs there is the possibillity of allocation failures
> due to memory fragmentationo with kmalloc().To ensure the memory is allocated even
> if kmalloc() can not find chunks which are big enough, we try to allocate the memory
> with vmalloc().
>
> Signed-off-by: Stefan Roscher <[email protected]>
> ---
>
> On Tuesday 21 April 2009 07:34:30 pm Roland Dreier wrote:
> > > + queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
> >
> > How big might this buffer be? Any chance of allocation failure due to
> > memory fragmentation?
> >
> > - R.
> Hey Roland,
> yes you are right and here is the patch to circumvent the described problem.
> It will apply on top of the patchset.
> regards Stefan
>
>
>
> drivers/infiniband/hw/ehca/ipz_pt_fn.c | 17 +++++++++++++----
> 1 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> index a260559..1227c59 100644
> --- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> +++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
> @@ -222,8 +222,11 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
> /* allocate queue page pointers */
> queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
> if (!queue->queue_pages) {
> - ehca_gen_err("Couldn't allocate queue page list");
> - return 0;
> + queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
> + if (!queue->queue_pages) {
> + ehca_gen_err("Couldn't allocate queue page list");
> + return 0;
> + }
> }
> memset(queue->queue_pages, 0, nr_of_pages * sizeof(void *));
>
> @@ -240,7 +243,10 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
> ipz_queue_ctor_exit0:
> ehca_gen_err("Couldn't alloc pages queue=%p "
> "nr_of_pages=%x", queue, nr_of_pages);
> - kfree(queue->queue_pages);
> + if (is_vmalloc_addr(queue->queue_pages))
> + vfree(queue->queue_pages);
> + else
> + kfree(queue->queue_pages);
>
> return 0;
> }
> @@ -262,7 +268,10 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
> free_page((unsigned long)queue->queue_pages[i]);
> }
>
> - kfree(queue->queue_pages);
> + if (is_vmalloc_addr(queue->queue_pages))
> + vfree(queue->queue_pages);
> + else
> + kfree(queue->queue_pages);
>
> return 1;
> }
> did you have a chance to take a look at the patchset and will you apply it, or
> are there any outstanding issues we need to address?
I guess it's OK, but definitely 2.6.31 material. I guess I'll stick it
linux-next soon.
- R.
On Tue, 28 Apr 2009 07:01:32 -0700
Roland Dreier <[email protected]> wrote:
> > did you have a chance to take a look at the patchset and will you apply it, or
> > are there any outstanding issues we need to address?
>
> I guess it's OK, but definitely 2.6.31 material. I guess I'll stick it
> linux-next soon.
>
> - R.
Okay with us, thank you very much!
Alex
On Tue, 2009-04-21 at 17:16 +0200, Stefan Roscher wrote:
> From: Anton Blanchard <antonb at au1.ibm.com>
>
> To improve performance of driver ressource allocation,
> replace the vmalloc() call with kmalloc().
Just curious, but how big are these allocations? Why was vmalloc() even
ever used if we know they'll be small?
-- Dave
On Tuesday 28 April 2009 05:12:51 pm Dave Hansen wrote:
> On Tue, 2009-04-21 at 17:16 +0200, Stefan Roscher wrote:
> > From: Anton Blanchard <antonb at au1.ibm.com>
> >
> > To improve performance of driver ressource allocation,
> > replace the vmalloc() call with kmalloc().
>
> Just curious, but how big are these allocations? Why was vmalloc() even
> ever used if we know they'll be small?
>
> -- Dave
>
>
The theoretical maximum size can be 512k, but for common queue pairs
less than 128k is used.Because of the theoretical maximum we implemented
vmalloc() first, but recognized a huge performance impact.
-- Stefan
thanks, applied.
> From: Anton Blanchard <antonb at au1.ibm.com>
> Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com>
please use '@' signs so these are real email addresses.
- R.