From: Salil Mehta
To: Leon Romanovsky
CC: "dledford@redhat.com", "Huwei (Xavier)", oulijun, "mehta.salil.lnk@gmail.com", "linux-rdma@vger.kernel.org", "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", Linuxarm, "Zhangping (ZP)"
Subject: RE: [PATCH for-next 03/11] IB/hns: Optimize the logic of allocating memory using APIs
Date: Mon, 21 Nov 2016 16:12:38 +0000
References: <20161104163633.141880-1-salil.mehta@huawei.com> <20161104163633.141880-4-salil.mehta@huawei.com> <20161109072130.GH27883@leon.nu> <20161116083602.GH4240@leon.nu>
In-Reply-To: <20161116083602.GH4240@leon.nu>
X-Mailing-List: linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Leon Romanovsky [mailto:leon@kernel.org]
> Sent: Wednesday, November 16, 2016 8:36 AM
> To: Salil Mehta
> Cc: dledford@redhat.com; Huwei (Xavier); oulijun;
> mehta.salil.lnk@gmail.com;
> linux-rdma@vger.kernel.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm;
> Zhangping (ZP)
> Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> allocating memory using APIs
>
> On Tue, Nov 15, 2016 at 03:52:46PM +0000, Salil Mehta wrote:
> > > -----Original Message-----
> > > From: Leon Romanovsky [mailto:leon@kernel.org]
> > > Sent: Wednesday, November 09, 2016 7:22 AM
> > > To: Salil Mehta
> > > Cc: dledford@redhat.com; Huwei (Xavier); oulijun;
> > > mehta.salil.lnk@gmail.com; linux-rdma@vger.kernel.org;
> > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm;
> > > Zhangping (ZP)
> > > Subject: Re: [PATCH for-next 03/11] IB/hns: Optimize the logic of
> > > allocating memory using APIs
> > >
> > > On Fri, Nov 04, 2016 at 04:36:25PM +0000, Salil Mehta wrote:
> > > > From: "Wei Hu (Xavier)"
> > > >
> > > > This patch modified the logic of allocating memory using APIs in
> > > > hns RoCE driver. We used kcalloc instead of kmalloc_array and
> > > > bitmap_zero. And When kcalloc failed, call vzalloc to alloc
> > > > memory.
> > > >
> > > > Signed-off-by: Wei Hu (Xavier)
> > > > Signed-off-by: Ping Zhang
> > > > Signed-off-by: Salil Mehta
> > > > ---
> > > >  drivers/infiniband/hw/hns/hns_roce_mr.c | 15 ++++++++-------
> > > >  1 file changed, 8 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > index fb87883..d3dfb5f 100644
> > > > --- a/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > +++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
> > > > @@ -137,11 +137,12 @@ static int hns_roce_buddy_init(struct hns_roce_buddy *buddy, int max_order)
> > > >
> > > >  	for (i = 0; i <= buddy->max_order; ++i) {
> > > >  		s = BITS_TO_LONGS(1 << (buddy->max_order - i));
> > > > -		buddy->bits[i] = kmalloc_array(s, sizeof(long), GFP_KERNEL);
> > > > -		if (!buddy->bits[i])
> > > > -			goto err_out_free;
> > > > -
> > > > -		bitmap_zero(buddy->bits[i], 1 << (buddy->max_order - i));
> > > > +		buddy->bits[i] = kcalloc(s, sizeof(long), GFP_KERNEL);
> > > > +		if (!buddy->bits[i]) {
> > > > +			buddy->bits[i] = vzalloc(s * sizeof(long));
> > >
> > > I wonder, why don't you use directly vzalloc instead of kcalloc
> > > fallback?
> > As we know we will have physical contiguous pages if the kcalloc
> > call succeeds. This will give us a chance to have better performance
> > over the allocations which are just virtually contiguous through the
> > function vzalloc(). Therefore, later has only been used as a fallback
> > when our memory request cannot be entertained through kcalloc.
> >
> > Are you suggesting that there will not be much performance penalty
> > if we use just vzalloc ?
>
> Not exactly,
> I asked it, because we have similar code in our drivers and this
> construction looks strange to me.
>
> 1. If performance is critical, we will use kmalloc.
> 2. If performance is not critical, we will use vmalloc.
>
> But in this case, such construction shows me that we can live with
> vmalloc performance and kmalloc allocation are not really needed.
>
> In your specific case, I'm not sure that kcalloc will ever fail.

Performance is definitely critical here, though I agree this is a
somewhat unusual way to allocate memory. In practice, we were hitting
allocation failures with kmalloc (the allocation size is on the
higher side and grows exponentially), so we ended up using vmalloc as
a fallback. It is a very naive allocation scheme.

Maybe we need to rethink this part of the allocation scheme?
Alternatively, I can pull this particular patch for now, or just live
with vzalloc() until we figure out a proper solution.

>
> Thanks
>
> >
> > >
> > > > +			if (!buddy->bits[i])
> > > > +				goto err_out_free;
> > > > +		}
> > > >  	}