From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: Ard Biesheuvel, Herbert Xu, Eric Biggers, Kees Cook
Subject: [PATCH 06/10] crypto: x86/cast6 - Use RIP-relative addressing
Date: Sat, 8 Apr 2023 17:27:18 +0200
Message-Id: <20230408152722.3975985-7-ardb@kernel.org>
In-Reply-To: <20230408152722.3975985-1-ardb@kernel.org>
References: <20230408152722.3975985-1-ardb@kernel.org>

Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.

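As an illustration of the constraint behind the lookup_32bit change, here is a minimal standalone sketch that is not part of the patch. The names demo_table and demo_lookup are hypothetical, and the four-entry table only stands in for the s1..s4 lookup tables. In 64-bit mode a RIP-relative operand cannot also carry an index register, so an indexed table access has to load the table base with leaq first and then index off that register.

```asm
	.section .rodata
	.align 4
demo_table:				# hypothetical stand-in for s1..s4
	.long	0x10, 0x20, 0x30, 0x40

	.text
	.globl	demo_lookup
	.type	demo_lookup, @function
# uint32_t demo_lookup(uint64_t idx)
# SysV ABI assumed: idx arrives in %rdi, result is returned in %eax.
demo_lookup:
	# Absolute form (what the old code used): disp32 + index * 4.
	# The operand embeds the link-time address of the table, so it
	# needs an absolute 32-bit relocation that must be fixed up if
	# the image runs at a different address:
	#	movl	demo_table(, %rdi, 4), %eax

	# RIP-relative form: compute the table base relative to %rip,
	# then apply the scaled index through an ordinary base register.
	leaq	demo_table(%rip), %rax
	movl	(%rax, %rdi, 4), %eax
	ret
	.size	demo_lookup, .-demo_lookup
```

The cost is one extra instruction and one scratch register per lookup. In the diff below, RID1 and RID2 alternate between the base and index roles for that reason.
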
Signed-off-by: Ard Biesheuvel
---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++++++++++---------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 82b716fd5dbac65a..180fb9c78de2d315 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -83,16 +83,20 @@
 
 
 #define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
-	movzbl		src ## bh, RID1d; \
-	movzbl		src ## bl, RID2d; \
-	shrq $16,	src; \
-	movl		s1(, RID1, 4), dst ## d; \
-	op1		s2(, RID2, 4), dst ## d; \
-	movzbl		src ## bh, RID1d; \
-	movzbl		src ## bl, RID2d; \
-	interleave_op(il_reg); \
-	op2		s3(, RID1, 4), dst ## d; \
-	op3		s4(, RID2, 4), dst ## d;
+	movzbl		src ## bh, RID1d; \
+	leaq		s1(%rip), RID2; \
+	movl		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl, RID2d; \
+	leaq		s2(%rip), RID1; \
+	op1		(RID1, RID2, 4), dst ## d; \
+	shrq $16,	src; \
+	movzbl		src ## bh, RID1d; \
+	leaq		s3(%rip), RID2; \
+	op2		(RID2, RID1, 4), dst ## d; \
+	movzbl		src ## bl, RID2d; \
+	leaq		s4(%rip), RID1; \
+	op3		(RID1, RID2, 4), dst ## d; \
+	interleave_op(il_reg);
 
 #define dummy(d) /* do nothing */
 
@@ -175,10 +179,10 @@
 	qop(RD, RC, 1);
 
 #define shuffle(mask) \
-	vpshufb		mask, RKR, RKR;
+	vpshufb		mask(%rip), RKR, RKR;
 
 #define preload_rkr(n, do_mask, mask) \
-	vbroadcastss	.L16_mask, RKR; \
+	vbroadcastss	.L16_mask(%rip), RKR; \
 	/* add 16-bit rotation to key rotations (mod 32) */ \
 	vpxor		(kr+n*16)(CTX), RKR, RKR; \
 	do_mask(mask);
@@ -258,9 +262,9 @@ SYM_FUNC_START_LOCAL(__cast6_enc_blk8)
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -284,7 +288,7 @@ SYM_FUNC_START_LOCAL(__cast6_enc_blk8)
 	popq %rbx;
 	popq %r15;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -306,9 +310,9 @@ SYM_FUNC_START_LOCAL(__cast6_dec_blk8)
 
 	movq %rdi, CTX;
 
-	vmovdqa .Lbswap_mask, RKM;
-	vmovd .Lfirst_mask, R1ST;
-	vmovd .L32_mask, R32;
+	vmovdqa .Lbswap_mask(%rip), RKM;
+	vmovd .Lfirst_mask(%rip), R1ST;
+	vmovd .L32_mask(%rip), R32;
 
 	inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -332,7 +336,7 @@ SYM_FUNC_START_LOCAL(__cast6_dec_blk8)
 	popq %rbx;
 	popq %r15;
 
-	vmovdqa .Lbswap_mask, RKM;
+	vmovdqa .Lbswap_mask(%rip), RKM;
 
 	outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
 	outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
-- 
2.39.2