Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp280603pxv; Wed, 30 Jun 2021 05:38:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwtn33seCHQ1+0f/DqBXpvcwv+O1DrRHmXutGlbsaz3UokWEwNT+ayc6oV7jNXJt5wKxwIJ X-Received: by 2002:a17:906:c1d0:: with SMTP id bw16mr35272626ejb.146.1625056682558; Wed, 30 Jun 2021 05:38:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1625056682; cv=none; d=google.com; s=arc-20160816; b=kWHLBTptz6qycq+Kh9p2Ao1UMocWoCkMXExkPWq84+nQHwkS8C3x/1e/6AAp/zLlGL Jb/thCFTTkosZ8vtL2lP9+WSaANQsAK7LeyLqntbuw7KqksjhaFyj39gFGroG5xu5lD8 6c67TidyaHk+AoXMmAlZOTutSec+cAdk+SuJVzvoTXlQHtAoeMTWsL6yWmzSRvQy05C6 yn1MPJ55y8Oqoyy24Ssxac9VDBadhGdQN7b/YsZlv7fpUk6MsvImyF2bL7GHFrbaU/IU yzPVreB66IqVXmhZv5zh4TLBnRsXOw/RmR5YPIN03w2aSBJGPwu6TAFSgrzw2D6o4NUc VQMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :to:subject; bh=tExpy0sy1x+xEqYBbD1Nk7N8QTRCz5CA2tpOdkkgJsk=; b=NywB3ZMimDFBjMq7STerbdZ1eCqVSiKRQXaEmmSHZPIVSHTjAzYLiaRRadSRV72rY9 53wKYaYkzpDUJ6uPu18xjqTVZgcrsoRcizK5IgzO/YfqzC7fZ6QKGvM/8Ghxi/poKAtc nvHFoecUEfzfFqqvdsdT5vlStFacv8lYBwwLJXFOJzftON0bsCvv/eHa2B8tBtjmZlZm R2dNF6y1klHgTa2oYO/MIAVQFWS5zxBarrJDvfrR3DHE8eQLFNIRV6ezA7oCtwOgZTk9 N8UwpT2Cn9re6+3HqthTi0QVCXWUxQdvnVOhULDnnXXqrbNmhTccQX89j50QDty2doET Cpjw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id gv8si5747694ejc.710.2021.06.30.05.37.32; Wed, 30 Jun 2021 05:38:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234606AbhF3Mj4 (ORCPT + 99 others); Wed, 30 Jun 2021 08:39:56 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]:48524 "EHLO out30-133.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234583AbhF3Mj4 (ORCPT ); Wed, 30 Jun 2021 08:39:56 -0400 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R211e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e01424;MF=tianjia.zhang@linux.alibaba.com;NM=1;PH=DS;RN=20;SR=0;TI=SMTPD_---0UeAu494_1625056642; Received: from B-455UMD6M-2027.local(mailfrom:tianjia.zhang@linux.alibaba.com fp:SMTPD_---0UeAu494_1625056642) by smtp.aliyun-inc.com(127.0.0.1); Wed, 30 Jun 2021 20:37:23 +0800 Subject: Re: [PATCH v2 0/4] Introduce x86 assembler accelerated implementation for SM4 algorithm To: Herbert Xu , "David S. Miller" , Eric Biggers , Eric Biggers , Gilad Ben-Yossef , Ard Biesheuvel , "Markku-Juhani O . Saarinen" , Jussi Kivilinna , Catalin Marinas , Will Deacon , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , x86@kernel.org, linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Jia Zhang , "YiLin . Li" References: <20210624080857.126660-1-tianjia.zhang@linux.alibaba.com> From: Tianjia Zhang Message-ID: <46fbbc81-d8c5-fd97-fd4f-71dd96c2c522@linux.alibaba.com> Date: Wed, 30 Jun 2021 20:37:22 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210624080857.126660-1-tianjia.zhang@linux.alibaba.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Hi, Any comment? Cheers, Tianjia On 6/24/21 4:08 PM, Tianjia Zhang wrote: > This patchset extracts the public SM4 algorithm as a separate library, > At the same time, the acceleration implementation of SM4 in arm64 was > adjusted to adapt to this SM4 library. Then introduces an accelerated > implementation of the instruction set on x86. > > This optimization supports the four modes of SM4, ECB, CBC, CFB, and > CTR. Since CBC and CFB do not support multiple block parallel > encryption, the optimization effect is not obvious. And all selftests > have passed already. > > The main algorithm implementation comes from SM4 AES-NI work by > libgcrypt and Markku-Juhani O. Saarinen at: > https://github.com/mjosaarinen/sm4ni > > Benchmark on Intel Xeon Cascadelake, the data comes from the mode 218 > and mode 518 of tcrypt. The abscissas are blocks of different lengths. > The data is tabulated and the unit is Mb/s: > > sm4-generic | 16 64 128 256 1024 1420 4096 > ECB enc | 40.99 46.50 48.05 48.41 49.20 49.25 49.28 > ECB dec | 41.07 46.99 48.15 48.67 49.20 49.25 49.29 > CBC enc | 37.71 45.28 46.77 47.60 48.32 48.37 48.40 > CBC dec | 36.48 44.82 46.43 47.45 48.23 48.30 48.36 > CFB enc | 37.94 44.84 46.12 46.94 47.57 47.46 47.68 > CFB dec | 37.50 42.84 43.74 44.37 44.85 44.80 44.96 > CTR enc | 39.20 45.63 46.75 47.49 48.09 47.85 48.08 > CTR dec | 39.64 45.70 46.72 47.47 47.98 47.88 48.06 > sm4-aesni-avx > ECB enc | 33.75 134.47 221.64 243.43 264.05 251.58 258.13 > ECB dec | 34.02 134.92 223.11 245.14 264.12 251.04 258.33 > CBC enc | 38.85 46.18 47.67 48.34 49.00 48.96 49.14 > CBC dec | 33.54 131.29 223.88 245.27 265.50 252.41 263.78 > CFB enc | 38.70 46.10 47.58 48.29 49.01 48.94 49.19 > CFB dec | 32.79 128.40 223.23 244.87 265.77 253.31 262.79 > CTR enc | 32.58 122.23 220.29 241.16 259.57 248.32 256.69 > CTR dec | 32.81 122.47 218.99 241.54 258.42 248.58 256.61 > > --- > v2 changes: > * SM4 library functions use "sm4_" prefix instead of "crypto_" prefix > * sm4-aesni-avx supports accelerated implementation of four specific modes > * tcrypt benchmark supports sm4-aesni-avx > * fixes of other reviews > > Tianjia Zhang (4): > crypto: sm4 - create SM4 library based on sm4 generic code > crypto: arm64/sm4-ce - Make dependent on sm4 library instead of > sm4-generic > crypto: x86/sm4 - add AES-NI/AVX/x86_64 implementation > crypto: tcrypt - add the asynchronous speed test for SM4 > > arch/arm64/crypto/Kconfig | 2 +- > arch/arm64/crypto/sm4-ce-glue.c | 20 +- > arch/x86/crypto/Makefile | 3 + > arch/x86/crypto/sm4-aesni-avx-asm_64.S | 684 +++++++++++++++++++++++++ > arch/x86/crypto/sm4_aesni_avx_glue.c | 537 +++++++++++++++++++ > crypto/Kconfig | 22 + > crypto/sm4_generic.c | 180 +------ > crypto/tcrypt.c | 26 +- > include/crypto/sm4.h | 29 +- > lib/crypto/Kconfig | 3 + > lib/crypto/Makefile | 3 + > lib/crypto/sm4.c | 184 +++++++ > 12 files changed, 1515 insertions(+), 178 deletions(-) > create mode 100644 arch/x86/crypto/sm4-aesni-avx-asm_64.S > create mode 100644 arch/x86/crypto/sm4_aesni_avx_glue.c > create mode 100644 lib/crypto/sm4.c >