From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Christophe Leroy, Michael Ellerman, Sasha Levin
Subject: [PATCH 4.14 046/246] powerpc/lib: Adjust .balign inside string functions for PPC32
Date: Wed, 1 Aug 2018 18:49:16 +0200
Message-Id: <20180801165013.920182664@linuxfoundation.org>
In-Reply-To: <20180801165011.700991984@linuxfoundation.org>
References: <20180801165011.700991984@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Christophe Leroy

[ Upstream commit 1128bb7813a896bd608fb622eee3c26aaf33b473 ]

commit 87a156fb18fe1 ("Align hot loops of some string functions")
degraded the performance of string functions by adding useless
nops.

A simple benchmark on an 8xx calling 100000x a memchr() that
matches the first byte runs in 41668 TB ticks before this patch
and in 35986 TB ticks after this patch. So this gives an
improvement of approx 10%.

Another benchmark doing the same with a memchr() matching the 128th
byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks
after this patch, so regardless of the number of loops, removing
those useless nops improves the test by 5683 TB ticks.

Fixes: 87a156fb18fe1 ("Align hot loops of some string functions")
Signed-off-by: Christophe Leroy
Signed-off-by: Michael Ellerman
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/include/asm/cache.h |    3 +++
 arch/powerpc/lib/string.S        |    7 ++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -9,11 +9,14 @@
 #if defined(CONFIG_PPC_8xx) || defined(CONFIG_403GCX)
 #define L1_CACHE_SHIFT		4
 #define MAX_COPY_PREFETCH	1
+#define IFETCH_ALIGN_SHIFT	2
 #elif defined(CONFIG_PPC_E500MC)
 #define L1_CACHE_SHIFT		6
 #define MAX_COPY_PREFETCH	4
+#define IFETCH_ALIGN_SHIFT	3
 #elif defined(CONFIG_PPC32)
 #define MAX_COPY_PREFETCH	4
+#define IFETCH_ALIGN_SHIFT	3	/* 603 fetches 2 insn at a time */
 #if defined(CONFIG_PPC_47x)
 #define L1_CACHE_SHIFT		7
 #else
--- a/arch/powerpc/lib/string.S
+++ b/arch/powerpc/lib/string.S
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include

 	.text

@@ -23,7 +24,7 @@ _GLOBAL(strncpy)
 	mtctr	r5
 	addi	r6,r3,-1
 	addi	r4,r4,-1
-	.balign 16
+	.balign IFETCH_ALIGN_BYTES
 1:	lbzu	r0,1(r4)
 	cmpwi	0,r0,0
 	stbu	r0,1(r6)
@@ -43,7 +44,7 @@ _GLOBAL(strncmp)
 	mtctr	r5
 	addi	r5,r3,-1
 	addi	r4,r4,-1
-	.balign 16
+	.balign IFETCH_ALIGN_BYTES
 1:	lbzu	r3,1(r5)
 	cmpwi	1,r3,0
 	lbzu	r0,1(r4)
@@ -77,7 +78,7 @@ _GLOBAL(memchr)
 	beq-	2f
 	mtctr	r5
 	addi	r3,r3,-1
-	.balign 16
+	.balign IFETCH_ALIGN_BYTES
 1:	lbzu	r0,1(r3)
 	cmpw	0,r0,r4
 	bdnzf	2,1b
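
As a rough illustration of why the alignment value matters, the sketch
below models how many nops ".balign N" places ahead of the "1:" loop
label and redoes the tick arithmetic from the changelog. It assumes
IFETCH_ALIGN_BYTES expands to 1 << IFETCH_ALIGN_SHIFT (that definition
is not part of the hunks shown). The function names, the CPU labels and
the sample address 0x104 are hypothetical. Python is used only so the
numbers can be checked outside the kernel tree.

```python
# Sketch only, not kernel code. Models the padding ".balign N" inserts
# ahead of a loop label and redoes the changelog's tick arithmetic.

INSN_BYTES = 4  # PowerPC instructions are fixed-width, 4 bytes each

# IFETCH_ALIGN_SHIFT values added by the patch; the dict keys are
# hypothetical labels for the three #if branches in asm/cache.h.
IFETCH_ALIGN_SHIFT = {"8xx_or_403GCX": 2, "e500mc": 3, "other_ppc32": 3}


def ifetch_align_bytes(shift):
    """Assumption: IFETCH_ALIGN_BYTES is defined as 1 << IFETCH_ALIGN_SHIFT."""
    return 1 << shift


def padding_nops(addr, align_bytes):
    """Nops needed to advance a 4-byte-aligned addr to the next align_bytes boundary."""
    return ((-addr) % align_bytes) // INSN_BYTES


def ticks_saved(before, after):
    """Timebase ticks saved over the whole benchmark run."""
    return before - after


if __name__ == "__main__":
    addr = 0x104  # hypothetical location counter value at the .balign directive
    print("old .balign 16:", padding_nops(addr, 16), "nops")
    for cpu, shift in IFETCH_ALIGN_SHIFT.items():
        n = ifetch_align_bytes(shift)
        print(f"{cpu}: .balign {n} -> {padding_nops(addr, n)} nops")

    first = ticks_saved(41668, 35986)      # memchr() matching the 1st byte
    far = ticks_saved(1011365, 1005682)    # memchr() matching the 128th byte
    print(first, far, round(100 * first / 41668, 1))
```

For the sample address, a 16-byte alignment needs 3 nops, an 8-byte
alignment needs 1, and a 4-byte alignment needs none. The tick figures
work out to 5682 saved in the first-byte case (about 13.6% of 41668,
which the changelog gives as approx 10%) and 5683 in the 128th-byte
case. That near-constant saving is consistent with the padding sitting
on the fall-through path before the loop label, where it is executed
once per call rather than once per iteration.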