From: Christophe Leroy
Subject: [PATCH v6 2/2] powerpc/lib: optimise PPC32 memcmp
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, segher@kernel.crashing.org
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Date: Wed, 30 May 2018 07:06:15 +0000 (UTC)
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, memcmp() compares two chunks of memory byte by byte.

This patch optimises the comparison by comparing word by word.

In the same way as commit 15c2d45d17418 ("powerpc: Add 64bit optimised
memcmp"), this patch moves memcmp() into a dedicated file named
memcmp_32.S

A small benchmark performed on an 8xx, comparing two chunks of
512 bytes 100000 times, gives:

Before: 5852274 TB ticks
After:  1488638 TB ticks

This is almost 4 times faster.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/lib/Makefile    |  4 ++--
 arch/powerpc/lib/memcmp_32.S | 45 ++++++++++++++++++++++++++++++++++++++++++++
 arch/powerpc/lib/string.S    | 17 -----------------
 3 files changed, 47 insertions(+), 19 deletions(-)
 create mode 100644 arch/powerpc/lib/memcmp_32.S

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 2c9b8c0adf22..d0ca13ad8231 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -26,14 +26,14 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
 			       memcpy_power7.o
 
 obj64-y	+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
-	   memcpy_64.o memcmp_64.o pmem.o
+	   memcpy_64.o pmem.o
 
 obj64-$(CONFIG_SMP)	+= locks.o
 obj64-$(CONFIG_ALTIVEC)	+= vmx-helper.o
 obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o
 
 obj-y			+= checksum_$(BITS).o checksum_wrappers.o \
-			   string_$(BITS).o
+			   string_$(BITS).o memcmp_$(BITS).o
 
 obj-y			+= sstep.o ldstfp.o quad.o
 obj64-y			+= quad.o
diff --git a/arch/powerpc/lib/memcmp_32.S b/arch/powerpc/lib/memcmp_32.S
new file mode 100644
index 000000000000..dcb6ab45be66
--- /dev/null
+++ b/arch/powerpc/lib/memcmp_32.S
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * memcmp for PowerPC32
+ *
+ * Copyright (C) 1996 Paul Mackerras.
+ *
+ */
+
+#include
+#include
+
+	.text
+
+_GLOBAL(memcmp)
+	srawi.	r7, r5, 2		/* Divide len by 4 */
+	mr	r6, r3
+	beq-	3f
+	mtctr	r7
+	li	r7, 0
+1:	lwzx	r3, r6, r7
+	lwzx	r0, r4, r7
+	addi	r7, r7, 4
+	cmplw	cr0, r3, r0
+	bdnzt	eq, 1b
+	bne	5f
+3:	andi.	r3, r5, 3
+	beqlr
+	cmplwi	cr1, r3, 2
+	blt-	cr1, 4f
+	lhzx	r3, r6, r7
+	lhzx	r0, r4, r7
+	addi	r7, r7, 2
+	subf.	r3, r0, r3
+	beqlr	cr1
+	bnelr
+4:	lbzx	r3, r6, r7
+	lbzx	r0, r4, r7
+	subf.	r3, r0, r3
+	blr
+5:	li	r3, 1
+	bgtlr
+	li	r3, -1
+	blr
+EXPORT_SYMBOL(memcmp)
diff --git a/arch/powerpc/lib/string.S b/arch/powerpc/lib/string.S
index 5343a88e619e..4b41970e9ed8 100644
--- a/arch/powerpc/lib/string.S
+++ b/arch/powerpc/lib/string.S
@@ -54,23 +54,6 @@ _GLOBAL(strncmp)
 	blr
 EXPORT_SYMBOL(strncmp)
 
-#ifdef CONFIG_PPC32
-_GLOBAL(memcmp)
-	PPC_LCMPI 0,r5,0
-	beq-	2f
-	mtctr	r5
-	addi	r6,r3,-1
-	addi	r4,r4,-1
-1:	lbzu	r3,1(r6)
-	lbzu	r0,1(r4)
-	subf.	r3,r0,r3
-	bdnzt	2,1b
-	blr
-2:	li	r3,0
-	blr
-EXPORT_SYMBOL(memcmp)
-#endif
-
 _GLOBAL(memchr)
 	PPC_LCMPI 0,r5,0
 	beq-	2f
-- 
2.13.3
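
For readers less familiar with PowerPC assembly, the word-by-word approach in memcmp_32.S can be roughly sketched in C as below. This is an illustrative sketch, not part of the patch: the names memcmp_wordwise(), load_be32() and load_be16() are hypothetical. The sketch assembles each word from bytes in big-endian order so that it behaves the same on any host, whereas the assembly loads words directly with lwzx and relies on PPC32 being big-endian.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers: build big-endian values from bytes, so that an
 * unsigned compare of two words orders them the same way a byte-by-byte
 * compare would.
 */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint32_t load_be16(const unsigned char *p)
{
	return ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

/* Sketch of the same control flow as the assembly: words, then tail. */
int memcmp_wordwise(const void *s1, const void *s2, size_t len)
{
	const unsigned char *a = s1, *b = s2;
	size_t i = 0;

	/* Main loop: len / 4 iterations, one 32-bit word per iteration. */
	for (size_t words = len >> 2; words; words--, i += 4) {
		uint32_t x = load_be32(a + i), y = load_be32(b + i);

		if (x != y)
			return x > y ? 1 : -1;	/* label 5: in the assembly */
	}

	/* Tail of 2 or 3 bytes: compare one halfword first. */
	if (len & 2) {
		int d = (int)load_be16(a + i) - (int)load_be16(b + i);

		if (d)
			return d;
		i += 2;
	}

	/* Tail of 1 or 3 bytes: compare the last byte. */
	if (len & 1)
		return (int)a[i] - (int)b[i];

	return 0;
}
```

As in the assembly, a mismatch in the word loop yields exactly 1 or -1, while a mismatch in the tail yields the difference of the halfwords or bytes. Callers should therefore rely only on the sign of the result.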