Date: Sun, 06 Oct 2019 19:12:28 +0000
Message-ID: <20191006191228.Horde.E8aAava9O1UOhVnxdaZzfqw@www.vdorst.com>
From: René van Dorst
To: Ard Biesheuvel
Cc: "Jason A. Donenfeld", "open list:HARDWARE RANDOM NUMBER GENERATOR CORE", Herbert Xu, David Miller, Greg KH, Linus Torvalds, Samuel Neves, Dan Carpenter, Arnd Bergmann, Eric Biggers, Andy Lutomirski, Will Deacon, Marc Zyngier, Catalin Marinas, Martin Willi, Peter Zijlstra, Josh Poimboeuf
Subject: Re: [PATCH v2 05/20] crypto: mips/chacha - import accelerated 32r2 code from Zinc
References: <20191002141713.31189-1-ard.biesheuvel@linaro.org> <20191002141713.31189-6-ard.biesheuvel@linaro.org> <20191004134644.GE112631@zx2c4.com> <20191004151524.Horde.zXUzQP5eBQt7Ybx5I75Ig5X@www.vdorst.com>
User-Agent: Horde Application Framework 5
Sender: linux-crypto-owner@vger.kernel.org
X-Mailing-List: linux-crypto@vger.kernel.org

Quoting Ard Biesheuvel:

Hi Ard,

> Thanks a lot for taking the time to double check this. I think it
> would be nice to be able to expose xchacha12 like we do on other
> architectures.
>
> Note that for xchacha, I also added a hchacha_block() routine based on
> your code (with the round count as the third argument) [0]. Please let
> me know if you see anything wrong with that.
>
>
> +.globl hchacha_block
> +.ent hchacha_block
> +hchacha_block:
> +	.frame $sp, STACK_SIZE, $ra
> +
> +	addiu $sp, -STACK_SIZE
> +
> +	/* Save s0-s7 */
> +	sw $s0, 0($sp)
> +	sw $s1, 4($sp)
> +	sw $s2, 8($sp)
> +	sw $s3, 12($sp)
> +	sw $s4, 16($sp)
> +	sw $s5, 20($sp)
> +	sw $s6, 24($sp)
> +	sw $s7, 28($sp)

We only have to preserve the s registers that are actually used.
Currently X11 to X15 use the registers s6 down to s2.

But by shuffling/redefining the needed registers, so that we use all the
registers that do not have to be preserved, I can reduce the number of
used s registers to one.
Registers that we don't use and don't have to preserve are a3, at and v0.
STATE (a0) can also be reused, because we only need that pointer while
loading the values from memory.

So:

#undef X12
#undef X13
#undef X14
#undef X15

#define X12 $a3
#define X13 $at
#define X14 $v0
#define X15 STATE

And save X11 (s6) on the stack.

See the full code here [0].

For the rest, the code looks good!

Greets,

René

[0]: https://github.com/vDorst/wireguard/commit/562a516ae3b282b32f57d3239369360bc926df60

> +
> +	lw X0, 0(STATE)
> +	lw X1, 4(STATE)
> +	lw X2, 8(STATE)
> +	lw X3, 12(STATE)
> +	lw X4, 16(STATE)
> +	lw X5, 20(STATE)
> +	lw X6, 24(STATE)
> +	lw X7, 28(STATE)
> +	lw X8, 32(STATE)
> +	lw X9, 36(STATE)
> +	lw X10, 40(STATE)
> +	lw X11, 44(STATE)
> +	lw X12, 48(STATE)
> +	lw X13, 52(STATE)
> +	lw X14, 56(STATE)
> +	lw X15, 60(STATE)
> +
> +.Loop_hchacha_xor_rounds:
> +	addiu $a2, -2
> +	AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16);
> +	AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12);
> +	AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8);
> +	AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7);
> +	AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16);
> +	AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12);
> +	AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8);
> +	AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7);
> +	bnez $a2, .Loop_hchacha_xor_rounds
> +
> +	sw X0, 0(OUT)
> +	sw X1, 4(OUT)
> +	sw X2, 8(OUT)
> +	sw X3, 12(OUT)
> +	sw X12, 16(OUT)
> +	sw X13, 20(OUT)
> +	sw X14, 24(OUT)
> +	sw X15, 28(OUT)
> +
> +	/* Restore used registers */
> +	lw $s0, 0($sp)
> +	lw $s1, 4($sp)
> +	lw $s2, 8($sp)
> +	lw $s3, 12($sp)
> +	lw $s4, 16($sp)
> +	lw $s5, 20($sp)
> +	lw $s6, 24($sp)
> +	lw $s7, 28($sp)
> +
> +	addiu $sp, STACK_SIZE
> +	jr $ra
> +.end hchacha_block
> +.set at
>
>
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=wireguard-crypto-library-api-v3&id=cc74a037f8152d52bd17feaf8d9142b61761484f
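
For readers who want to cross-check the quoted assembly, here is a minimal C sketch of what the routine appears to compute: the loop runs two rounds per iteration (one column round, one diagonal round) with the rotate amounts 16, 12, 8 and 7 seen in the AXR lines, and then stores words 0-3 and 12-15 without adding the input state back in. The names hchacha_block_ref() and chacha_quarter_round() are hypothetical and are not taken from the patch; the quarter round is the standard ChaCha one, assumed here to match the AXR macro.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by s bits (0 < s < 32). */
static uint32_t rotl32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/* Standard ChaCha quarter round (rotates by 16, 12, 8, 7). */
static void chacha_quarter_round(uint32_t *a, uint32_t *b,
				 uint32_t *c, uint32_t *d)
{
	*a += *b; *d ^= *a; *d = rotl32(*d, 16);
	*c += *d; *b ^= *c; *b = rotl32(*b, 12);
	*a += *b; *d ^= *a; *d = rotl32(*d, 8);
	*c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/*
 * Hypothetical reference routine, not the function from the patch.
 * state:   16 input words
 * out:     8 output words (state words 0-3 and 12-15 after the rounds)
 * nrounds: round count, assumed even (e.g. 20 or 12)
 */
static void hchacha_block_ref(const uint32_t state[16], uint32_t out[8],
			      int nrounds)
{
	uint32_t x[16];
	int i;

	assert(nrounds >= 0 && nrounds % 2 == 0);

	for (i = 0; i < 16; i++)
		x[i] = state[i];

	/* Each iteration performs one double round, like the asm loop. */
	for (i = nrounds; i > 0; i -= 2) {
		/* Column round */
		chacha_quarter_round(&x[0], &x[4], &x[8],  &x[12]);
		chacha_quarter_round(&x[1], &x[5], &x[9],  &x[13]);
		chacha_quarter_round(&x[2], &x[6], &x[10], &x[14]);
		chacha_quarter_round(&x[3], &x[7], &x[11], &x[15]);
		/* Diagonal round */
		chacha_quarter_round(&x[0], &x[5], &x[10], &x[15]);
		chacha_quarter_round(&x[1], &x[6], &x[11], &x[12]);
		chacha_quarter_round(&x[2], &x[7], &x[8],  &x[13]);
		chacha_quarter_round(&x[3], &x[4], &x[9],  &x[14]);
	}

	/* No feed-forward addition: only the permuted words are stored. */
	for (i = 0; i < 4; i++) {
		out[i] = x[i];
		out[4 + i] = x[12 + i];
	}
}
```

Comparing the output of such a reference against the assembly for the same input and round count would be one way to check that a register remapping like the one proposed above does not change the result.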