From: Mathieu Malaterre
Date: Thu, 17 May 2018 15:03:12 +0200
Subject: Re: [PATCH v2 5/5] powerpc/lib: inline memcmp() for small constant sizes
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, linuxppc-dev, LKML
In-Reply-To: <8a6f90d882c8b60e5fa0826cd23dd70a92075659.1526553552.git.christophe.leroy@c-s.fr>

On Thu, May 17, 2018 at 12:49 PM, Christophe Leroy wrote:
> In my 8xx configuration, I get 208 calls to memcmp()
> Within those 208 calls, about half of them have constant sizes,
> 46 have a size of 8, 17 have a size of 16, only a few have a
> size over 16. Other fixed sizes are mostly 4, 6 and 10.
>
> This patch inlines calls to memcmp() when size
> is constant and lower than or equal to 16
>
> In my 8xx configuration, this reduces the number of calls
> to memcmp() from 208 to 123
>
> The following table shows the number of TB timeticks to perform
> a constant size memcmp() before and after the patch depending on
> the size
>
>         Before  After   Improvement
> 01:      7577    5682   25%
> 02:     41668    5682   86%
> 03:     51137   13258   74%
> 04:     45455    5682   87%
> 05:     58713   13258   77%
> 06:     58712   13258   77%
> 07:     68183   20834   70%
> 08:     56819   15153   73%
> 09:     70077   28411   60%
> 10:     70077   28411   60%
> 11:     79546   35986   55%
> 12:     68182   28411   58%
> 13:     81440   35986   55%
> 14:     81440   39774   51%
> 15:     94697   43562   54%
> 16:     79546   37881   52%
>
> Signed-off-by: Christophe Leroy
> ---
>  arch/powerpc/include/asm/string.h | 46 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/string.h b/arch/powerpc/include/asm/string.h
> index 35f1aaad9b50..80cf0f9605dd 100644
> --- a/arch/powerpc/include/asm/string.h
> +++ b/arch/powerpc/include/asm/string.h
> @@ -4,6 +4,8 @@
>
>  #ifdef __KERNEL__
>
> +#include
> +
>  #define __HAVE_ARCH_STRNCPY
>  #define __HAVE_ARCH_STRNCMP
>  #define __HAVE_ARCH_MEMSET
> @@ -51,10 +53,54 @@ static inline int strncmp(const char *p, const char *q, __kernel_size_t size)
>  	return __strncmp(p, q, size);
>  }
>
> +static inline int __memcmp1(const void *p, const void *q, int off)

Would it change anything if you used char * pointers instead of void *?
I find void * arithmetic hard to read.

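For illustration only, here is a user-space sketch of the byte-pointer
form. The name memcmp1_bytes and the use of unsigned char in place of
the kernel's u8 are assumptions made for this example; they are not
part of the patch.

```c
#include <stddef.h>

/*
 * Hypothetical user-space variant of __memcmp1(): the void * arguments
 * are converted to byte pointers once, so the offset is applied with
 * ordinary array indexing rather than arithmetic on void *.
 */
static inline int memcmp1_bytes(const void *p, const void *q, int off)
{
	const unsigned char *a = p;
	const unsigned char *b = q;

	/* Both operands are promoted to int, so the result is in [-255, 255]. */
	return a[off] - b[off];
}
```

The external prototype keeps const void *, so callers would not need to
change; only the body avoids the GNU extension that gives void * an
element size of 1.
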
> +{
> +	return *(u8*)(p + off) - *(u8*)(q + off);
> +}
> +
> +static inline int __memcmp2(const void *p, const void *q, int off)
> +{
> +	return be16_to_cpu(*(u16*)(p + off)) - be16_to_cpu(*(u16*)(q + off));
> +}
> +
> +static inline int __memcmp4(const void *p, const void *q, int off)
> +{
> +	return be32_to_cpu(*(u32*)(p + off)) - be32_to_cpu(*(u32*)(q + off));
> +}
> +
> +static inline int __memcmp8(const void *p, const void *q, int off)
> +{
> +	s64 tmp = be64_to_cpu(*(u64*)(p + off)) - be64_to_cpu(*(u64*)(q + off));

I always assumed that an unaligned 64-bit access would trigger an
exception. Is that correct?

> +	return tmp >> 32 ? : (int)tmp;
> +}
> +
> +static inline int __memcmp_cst(const void *p,const void *q,__kernel_size_t size)
> +{
> +	if (size == 1)
> +		return __memcmp1(p, q, 0);
> +	if (size == 2)
> +		return __memcmp2(p, q, 0);
> +	if (size == 3)
> +		return __memcmp2(p, q, 0) ? : __memcmp1(p, q, 2);
> +	if (size == 4)
> +		return __memcmp4(p, q, 0);
> +	if (size == 5)
> +		return __memcmp4(p, q, 0) ? : __memcmp1(p, q, 4);
> +	if (size == 6)
> +		return __memcmp4(p, q, 0) ? : __memcmp2(p, q, 4);
> +	if (size == 7)
> +		return __memcmp4(p, q, 0) ? : __memcmp2(p, q, 4) ? : __memcmp1(p, q, 6);
> +	return __memcmp8(p, q, 0);
> +}
> +
>  static inline int memcmp(const void *p,const void *q,__kernel_size_t size)
>  {
>  	if (unlikely(!size))
>  		return 0;
> +	if (__builtin_constant_p(size) && size <= 8)
> +		return __memcmp_cst(p, q, size);
> +	if (__builtin_constant_p(size) && size <= 16)
> +		return __memcmp8(p, q, 0) ? : __memcmp_cst(p + 8, q + 8, size - 8);
>  	return __memcmp(p, q, size);
>  }
>
> --
> 2.13.3
>
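
On the unaligned 64-bit access question above: whether a given powerpc
core traps on such a load is not settled in this message. As a point of
comparison only, the user-space sketch below sidesteps the question by
assembling the value one byte at a time, so no wide load is ever issued.
The names load_be64_any and memcmp8_any are hypothetical and do not
appear in the patch; the sketch also returns a three-way comparison
instead of a difference, a choice made here purely to keep the sign of
the result easy to reason about.

```c
#include <stdint.h>

/*
 * Hypothetical helper: read 8 bytes as a big-endian 64-bit value.
 * Only byte loads are used, so the pointer may have any alignment
 * and the result does not depend on host endianness.
 */
static inline uint64_t load_be64_any(const void *p)
{
	const unsigned char *b = p;
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | b[i];
	return v;
}

/*
 * Hypothetical 8-byte compare with memcmp()-style sign:
 * negative, zero or positive.
 */
static inline int memcmp8_any(const void *p, const void *q)
{
	uint64_t a = load_be64_any(p);
	uint64_t b = load_be64_any(q);

	return (a > b) - (a < b);
}
```

This is expected to be slower than a single 64-bit load, so it is a
reference for correctness checks rather than a candidate replacement
for __memcmp8().
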