Message-Id: <09da6fec57792d6559d1ea64e00be9870b02dab4.1617896018.git.christophe.leroy@csgroup.eu>
From: Christophe Leroy
Subject: [PATCH v1 1/2] powerpc/bitops: Use immediate operand when possible
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Date: Thu, 8 Apr 2021 15:33:44 +0000 (UTC)
X-Mailing-List: linux-kernel@vger.kernel.org

Today we get the following code generation for bitops like
set or clear bit:

	c0009fe0:	39 40 08 00 	li      r10,2048
	c0009fe4:	7c e0 40 28 	lwarx   r7,0,r8
	c0009fe8:	7c e7 53 78 	or      r7,r7,r10
	c0009fec:	7c e0 41 2d 	stwcx.  r7,0,r8

	c000c044:	39 40 20 00 	li      r10,8192
	c000c048:	7c e0 40 28 	lwarx   r7,0,r8
	c000c04c:	7c e7 50 78 	andc    r7,r7,r10
	c000c050:	7c e0 41 2d 	stwcx.  r7,0,r8

Most masks passed to set bits are constants that fit in the lower
16 bits, so the operation can easily be replaced by its "immediate"
version. Allow GCC to choose between the normal and the immediate form.

For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc'
when all bits to be cleared are consecutive. For the time being only
handle the single bit case, which we detect by checking whether the
mask is a power of two. We can't use the is_power_of_2() function
because it is not included yet, but it is easy to code with
(mask & (mask - 1)), and even the 0 case, which is not a power of
two, is acceptable for us.

On 64 bits we don't have any equivalent single operation; we'd need
two 'rldicl', so it is not worth it.

With this patch we get:

	c0009fe0:	7d 00 50 28 	lwarx   r8,0,r10
	c0009fe4:	61 08 08 00 	ori     r8,r8,2048
	c0009fe8:	7d 00 51 2d 	stwcx.  r8,0,r10

	c000c034:	7d 00 50 28 	lwarx   r8,0,r10
	c000c038:	55 08 04 e2 	rlwinm  r8,r8,0,19,17
	c000c03c:	7d 00 51 2d 	stwcx.  r8,0,r10

On pmac32_defconfig, it reduces the text by approx 10 kbytes.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/bitops.h | 77 +++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 299ab33505a6..0b0c6bdd9be9 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -71,19 +71,49 @@ static inline void fn(unsigned long mask,	\
 	__asm__ __volatile__ (			\
 	prefix					\
 "1:"	PPC_LLARX(%0,0,%3,0) "\n"		\
-	stringify_in_c(op) "%0,%0,%2\n"		\
+	#op "%I2 %0,%0,%2\n"			\
 	PPC_STLCX "%0,0,%3\n"			\
 	"bne- 1b\n"				\
 	: "=&r" (old), "+m" (*p)		\
-	: "r" (mask), "r" (p)			\
+	: "rK" (mask), "r" (p)			\
 	: "cc", "memory");			\
 }
 
 DEFINE_BITOP(set_bits, or, "")
-DEFINE_BITOP(clear_bits, andc, "")
-DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER)
 DEFINE_BITOP(change_bits, xor, "")
 
+#define DEFINE_CLROP(fn, prefix)					\
+static inline void fn(unsigned long mask, volatile unsigned long *_p)	\
+{									\
+	unsigned long old;						\
+	unsigned long *p = (unsigned long *)_p;				\
+	if (IS_ENABLED(CONFIG_PPC32) &&					\
+	    __builtin_constant_p(mask) && !(mask & (mask - 1))) {	\
+		asm volatile (						\
+			prefix						\
+		"1:"	"lwarx	%0,0,%3\n"				\
+			"rlwinm	%0,%0,0,%2\n"				\
+			"stwcx.	%0,0,%3\n"				\
+			"bne- 1b\n"					\
+			: "=&r" (old), "+m" (*p)			\
+			: "i" (~mask), "r" (p)				\
+			: "cc", "memory");				\
+	} else {							\
+		asm volatile (						\
+			prefix						\
+		"1:"	PPC_LLARX(%0,0,%3,0) "\n"			\
+			"andc %0,%0,%2\n"				\
+			PPC_STLCX "%0,0,%3\n"				\
+			"bne- 1b\n"					\
+			: "=&r" (old), "+m" (*p)			\
+			: "r" (mask), "r" (p)				\
+			: "cc", "memory");				\
+	}								\
+}
+
+DEFINE_CLROP(clear_bits, "")
+DEFINE_CLROP(clear_bits_unlock, PPC_RELEASE_BARRIER)
+
 static inline void arch_set_bit(int nr, volatile unsigned long *addr)
 {
 	set_bits(BIT_MASK(nr), addr + BIT_WORD(nr));
@@ -116,12 +146,12 @@ static inline unsigned long fn(			\
 	__asm__ __volatile__ (				\
 	prefix						\
 "1:"	PPC_LLARX(%0,0,%3,eh) "\n"			\
-	stringify_in_c(op) "%1,%0,%2\n"			\
+	#op "%I2 %1,%0,%2\n"				\
 	PPC_STLCX "%1,0,%3\n"				\
 	"bne- 1b\n"					\
 	postfix						\
 	: "=&r" (old), "=&r" (t)			\
-	: "r" (mask), "r" (p)				\
+	: "rK" (mask), "r" (p)				\
 	: "cc", "memory");				\
 	return (old & mask);				\
 }
@@ -130,11 +160,42 @@ DEFINE_TESTOP(test_and_set_bits, or, PPC_ATOMIC_ENTRY_BARRIER,
 	      PPC_ATOMIC_EXIT_BARRIER, 0)
 DEFINE_TESTOP(test_and_set_bits_lock, or, "",
 	      PPC_ACQUIRE_BARRIER, 1)
-DEFINE_TESTOP(test_and_clear_bits, andc, PPC_ATOMIC_ENTRY_BARRIER,
-	      PPC_ATOMIC_EXIT_BARRIER, 0)
 DEFINE_TESTOP(test_and_change_bits, xor, PPC_ATOMIC_ENTRY_BARRIER,
 	      PPC_ATOMIC_EXIT_BARRIER, 0)
 
+static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p)
+{
+	unsigned long old, t;
+	unsigned long *p = (unsigned long *)_p;
+
+	if (IS_ENABLED(CONFIG_PPC32) &&
+	    __builtin_constant_p(mask) && !(mask & (mask - 1))) {
+		asm volatile (
+			PPC_ATOMIC_ENTRY_BARRIER
+		"1:"	PPC_LLARX(%0,0,%3,0) "\n"
+			"rlwinm	%1,%0,0,%2\n"
+			PPC_STLCX "%1,0,%3\n"
+			"bne- 1b\n"
+			PPC_ATOMIC_EXIT_BARRIER
+			: "=&r" (old), "=&r" (t)
+			: "i" (~mask), "r" (p)
+			: "cc", "memory");
+	} else {
+		asm volatile (
+			PPC_ATOMIC_ENTRY_BARRIER
+		"1:"	PPC_LLARX(%0,0,%3,0) "\n"
+			"andc	%1,%0,%2\n"
+			PPC_STLCX "%1,0,%3\n"
+			"bne- 1b\n"
+			PPC_ATOMIC_EXIT_BARRIER
+			: "=&r" (old), "=&r" (t)
+			: "r" (mask), "r" (p)
+			: "cc", "memory");
+	}
+
+	return (old & mask);
+}
+
 static inline int arch_test_and_set_bit(unsigned long nr,
 					volatile unsigned long *addr)
 {
-- 
2.25.0
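
Not part of the patch: a small portable C sketch of the arithmetic the commit message relies on, so that the numbers in the two listings can be checked by hand. All helper names here (at_most_one_bit, fits_u16_immediate, rlwinm_mask, check_commit_message_examples) are hypothetical and do not exist in the kernel. The 16-bit range check reflects my understanding of what the "K" constraint accepts, and rlwinm_mask() is only a model of the MB..ME mask (IBM bit numbering, bit 0 is the most significant bit), not the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Same test as in the patch: true for a single-bit mask, and also for 0. */
static inline int at_most_one_bit(uint32_t mask)
{
	return !(mask & (mask - 1));
}

/* Assumption: an unsigned 16-bit constant is what the immediate forms take. */
static inline int fits_u16_immediate(uint32_t mask)
{
	return mask <= 0xffffu;
}

/*
 * Model of the 32-bit rlwinm mask for MB..ME, with bit 0 as the MSB.
 * When MB > ME the mask wraps around: bits MB..31 and bits 0..ME are set.
 */
static inline uint32_t rlwinm_mask(unsigned int mb, unsigned int me)
{
	uint32_t from_mb = 0xffffffffu >> mb;        /* bits mb..31 */
	uint32_t upto_me = 0xffffffffu << (31 - me); /* bits 0..me  */

	return mb <= me ? (from_mb & upto_me) : (from_mb | upto_me);
}

/* Checks the values that appear in the listings of the commit message. */
static inline void check_commit_message_examples(void)
{
	uint32_t word = 0xffffffffu;

	/* 'li r10,2048' + 'or' can become 'ori r8,r8,2048'. */
	assert(fits_u16_immediate(2048u));

	/* 8192 is a single bit, so the rlwinm path is taken on 32 bits. */
	assert(at_most_one_bit(8192u));

	/* 'rlwinm r8,r8,0,19,17' keeps every bit except the one at 8192. */
	assert(rlwinm_mask(19, 17) == (uint32_t)~8192u);
	assert((word & rlwinm_mask(19, 17)) == (word & ~8192u));

	/* 0 passes the check too; a two-bit mask does not. */
	assert(at_most_one_bit(0u));
	assert(!at_most_one_bit(0x3000u));
}
```

With MB=19 and ME=17 the mask wraps around and leaves out only IBM bit 18, which is 1 << (31 - 18) = 8192, matching the 'andc' with 8192 in the original listing.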