From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Stephan Müller, Theodore Tso, Eric Biggers, Herbert Xu, Sasha Levin
Subject: [PATCH 4.9 095/264] crypto: chacha20 - Fix chacha20_block() keystream alignment (again)
Date: Thu, 23 Jun 2022 18:41:28 +0200
Message-Id: <20220623164346.760209489@linuxfoundation.org>
In-Reply-To: <20220623164344.053938039@linuxfoundation.org>
References: <20220623164344.053938039@linuxfoundation.org>
X-Mailer: git-send-email 2.36.1
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: Eric Biggers

[ Upstream commit a5e9f557098e54af44ade5d501379be18435bfbf ]

In commit 9f480faec58c ("crypto: chacha20 - Fix keystream alignment for
chacha20_block()"), I had missed that chacha20_block() can be called
directly on the buffer passed to get_random_bytes(), which can have any
alignment.  So, while my commit didn't break anything, it didn't fully
solve the alignment problems.

Revert my solution and just update chacha20_block() to use
put_unaligned_le32(), so the output buffer need not be aligned.
This is simpler, and on many CPUs it's the same speed.

But, I kept the 'tmp' buffers in extract_crng_user() and
_get_random_bytes() 4-byte aligned, since that alignment is actually
needed for _crng_backtrack_protect() too.

Reported-by: Stephan Müller
Cc: Theodore Ts'o
Signed-off-by: Eric Biggers
Signed-off-by: Herbert Xu
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 crypto/chacha20_generic.c |    7 ++++---
 drivers/char/random.c     |   24 ++++++++++++------------
 include/crypto/chacha20.h |    3 +--
 lib/chacha20.c            |    6 +++---
 4 files changed, 20 insertions(+), 20 deletions(-)

--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha20_generic.c
@@ -23,20 +23,21 @@ static inline u32 le32_to_cpuvp(const vo
 static void chacha20_docrypt(u32 *state, u8 *dst, const u8 *src,
 			     unsigned int bytes)
 {
-	u32 stream[CHACHA20_BLOCK_WORDS];
+	/* aligned to potentially speed up crypto_xor() */
+	u8 stream[CHACHA20_BLOCK_SIZE] __aligned(sizeof(long));

 	if (dst != src)
 		memcpy(dst, src, bytes);

 	while (bytes >= CHACHA20_BLOCK_SIZE) {
 		chacha20_block(state, stream);
-		crypto_xor(dst, (const u8 *)stream, CHACHA20_BLOCK_SIZE);
+		crypto_xor(dst, stream, CHACHA20_BLOCK_SIZE);
 		bytes -= CHACHA20_BLOCK_SIZE;
 		dst += CHACHA20_BLOCK_SIZE;
 	}
 	if (bytes) {
 		chacha20_block(state, stream);
-		crypto_xor(dst, (const u8 *)stream, bytes);
+		crypto_xor(dst, stream, bytes);
 	}
 }

--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -486,9 +486,9 @@ static int crng_init_cnt = 0;
 static unsigned long crng_global_init_time = 0;
 #define CRNG_INIT_CNT_THRESH (2*CHACHA20_KEY_SIZE)
 static void _extract_crng(struct crng_state *crng,
-			  __u32 out[CHACHA20_BLOCK_WORDS]);
+			  __u8 out[CHACHA20_BLOCK_SIZE]);
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u32 tmp[CHACHA20_BLOCK_WORDS], int used);
+				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used);
 static void process_random_ready_list(void);
 static void _get_random_bytes(void *buf, int nbytes);

@@ -1038,7 +1038,7 @@ static void crng_reseed(struct crng_stat
 	unsigned long	flags;
 	int		i, num;
 	union {
-		__u32	block[CHACHA20_BLOCK_WORDS];
+		__u8	block[CHACHA20_BLOCK_SIZE];
 		__u32	key[8];
 	} buf;

@@ -1066,7 +1066,7 @@ static void crng_reseed(struct crng_stat
 }

 static void _extract_crng(struct crng_state *crng,
-			  __u32 out[CHACHA20_BLOCK_WORDS])
+			  __u8 out[CHACHA20_BLOCK_SIZE])
 {
 	unsigned long flags, init_time;

@@ -1084,7 +1084,7 @@ static void _extract_crng(struct crng_st
 	spin_unlock_irqrestore(&crng->lock, flags);
 }

-static void extract_crng(__u32 out[CHACHA20_BLOCK_WORDS])
+static void extract_crng(__u8 out[CHACHA20_BLOCK_SIZE])
 {
 	_extract_crng(select_crng(), out);
 }
@@ -1094,7 +1094,7 @@ static void extract_crng(__u32 out[CHACH
  * enough) to mutate the CRNG key to provide backtracking protection.
  */
 static void _crng_backtrack_protect(struct crng_state *crng,
-				    __u32 tmp[CHACHA20_BLOCK_WORDS], int used)
+				    __u8 tmp[CHACHA20_BLOCK_SIZE], int used)
 {
 	unsigned long	flags;
 	__u32		*s, *d;
@@ -1106,14 +1106,14 @@ static void _crng_backtrack_protect(stru
 		used = 0;
 	}
 	spin_lock_irqsave(&crng->lock, flags);
-	s = &tmp[used / sizeof(__u32)];
+	s = (__u32 *) &tmp[used];
 	d = &crng->state[4];
 	for (i=0; i < 8; i++)
 		*d++ ^= *s++;
 	spin_unlock_irqrestore(&crng->lock, flags);
 }

-static void crng_backtrack_protect(__u32 tmp[CHACHA20_BLOCK_WORDS], int used)
+static void crng_backtrack_protect(__u8 tmp[CHACHA20_BLOCK_SIZE], int used)
 {
 	_crng_backtrack_protect(select_crng(), tmp, used);
 }
@@ -1121,7 +1121,7 @@ static void crng_backtrack_protect(__u32
 static ssize_t extract_crng_user(void __user *buf, size_t nbytes)
 {
 	ssize_t ret = 0, i = CHACHA20_BLOCK_SIZE;
-	__u32 tmp[CHACHA20_BLOCK_WORDS];
+	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);
 	int large_request = (nbytes > 256);

 	while (nbytes) {
@@ -1580,7 +1580,7 @@ static void _warn_unseeded_randomness(co
  */
 static void _get_random_bytes(void *buf, int nbytes)
 {
-	__u32 tmp[CHACHA20_BLOCK_WORDS];
+	__u8 tmp[CHACHA20_BLOCK_SIZE] __aligned(4);

 	trace_get_random_bytes(nbytes, _RET_IP_);

@@ -2167,7 +2167,7 @@ u64 get_random_u64(void)
 	batch = raw_cpu_ptr(&batched_entropy_u64);
 	spin_lock_irqsave(&batch->batch_lock, flags);
 	if (batch->position % ARRAY_SIZE(batch->entropy_u64) == 0) {
-		extract_crng((__u32 *)batch->entropy_u64);
+		extract_crng((u8 *)batch->entropy_u64);
 		batch->position = 0;
 	}
 	ret = batch->entropy_u64[batch->position++];
@@ -2191,7 +2191,7 @@ u32 get_random_u32(void)
 	batch = raw_cpu_ptr(&batched_entropy_u32);
 	spin_lock_irqsave(&batch->batch_lock, flags);
 	if (batch->position % ARRAY_SIZE(batch->entropy_u32) == 0) {
-		extract_crng(batch->entropy_u32);
+		extract_crng((u8 *)batch->entropy_u32);
 		batch->position = 0;
 	}
 	ret = batch->entropy_u32[batch->position++];
--- a/include/crypto/chacha20.h
+++ b/include/crypto/chacha20.h
@@ -11,13 +11,12 @@
 #define CHACHA20_IV_SIZE	16
 #define CHACHA20_KEY_SIZE	32
 #define CHACHA20_BLOCK_SIZE	64
-#define CHACHA20_BLOCK_WORDS	(CHACHA20_BLOCK_SIZE / sizeof(u32))

 struct chacha20_ctx {
 	u32 key[8];
 };

-void chacha20_block(u32 *state, u32 *stream);
+void chacha20_block(u32 *state, u8 *stream);
 void crypto_chacha20_init(u32 *state, struct chacha20_ctx *ctx, u8 *iv);
 int crypto_chacha20_setkey(struct crypto_tfm *tfm, const u8 *key,
 			   unsigned int keysize);
--- a/lib/chacha20.c
+++ b/lib/chacha20.c
@@ -21,9 +21,9 @@ static inline u32 rotl32(u32 v, u8 n)
 	return (v << n) | (v >> (sizeof(v) * 8 - n));
 }

-void chacha20_block(u32 *state, u32 *stream)
+void chacha20_block(u32 *state, u8 *stream)
 {
-	u32 x[16], *out = stream;
+	u32 x[16];
 	int i;

 	for (i = 0; i < ARRAY_SIZE(x); i++)
@@ -72,7 +72,7 @@ void chacha20_block(u32 *state, u32 *str
 	}

 	for (i = 0; i < ARRAY_SIZE(x); i++)
-		out[i] = cpu_to_le32(x[i] + state[i]);
+		put_unaligned_le32(x[i] + state[i], &stream[i * sizeof(u32)]);

 	state[12]++;
 }
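
Illustration (not part of the patch): the commit message relies on
put_unaligned_le32() to make the output buffer's alignment irrelevant.
The user-space C sketch below shows the general idea of such a store.
The names store_le32_unaligned(), load_le32_unaligned() and
store_block() are hypothetical helpers invented for this note; they are
not the kernel's implementation, which may use different, arch-specific
code. The 16-word block size mirrors CHACHA20_BLOCK_SIZE / 4 from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's put_unaligned_le32(): store a
 * 32-bit value in little-endian byte order one byte at a time, so the
 * destination pointer may have any alignment.
 */
static void store_le32_unaligned(uint32_t val, uint8_t *dst)
{
	dst[0] = (uint8_t)(val);
	dst[1] = (uint8_t)(val >> 8);
	dst[2] = (uint8_t)(val >> 16);
	dst[3] = (uint8_t)(val >> 24);
}

/* Matching byte-wise little-endian load, used here only for checking. */
static uint32_t load_le32_unaligned(const uint8_t *src)
{
	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}

/*
 * Serialize 16 words into a 64-byte stream, in the same shape as the
 * patched loop at the end of chacha20_block(): word i lands at byte
 * offset i * sizeof(u32), regardless of how 'stream' is aligned.
 */
static void store_block(const uint32_t words[16], uint8_t *stream)
{
	assert(stream != NULL);

	for (size_t i = 0; i < 16; i++)
		store_le32_unaligned(words[i], &stream[i * sizeof(uint32_t)]);
}
```

Calling store_block(words, buf + 1) on a 65-byte buffer writes the block
at an odd address, which a plain u32 store could not do portably. By
contrast, the cast s = (__u32 *) &tmp[used] in _crng_backtrack_protect()
still dereferences a u32 pointer directly, which is why the patch keeps
the 'tmp' buffers declared with __aligned(4).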