From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Richard Biener, Jakub Jelinek, Ard Biesheuvel, Arnd Bergmann, Herbert Xu, Sasha Levin
Subject: [PATCH 4.14 079/138] crypto: aes-generic - build with -Os on gcc-7+
Date: Wed, 11 Apr 2018 00:24:29 +0200
Message-Id: <20180410212911.294522896@linuxfoundation.org>
In-Reply-To: <20180410212902.121524696@linuxfoundation.org>
References: <20180410212902.121524696@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann

[ Upstream commit 148b974deea927f5dbb6c468af2707b488bfa2de ]

While testing other changes, I discovered that gcc-7.2.1 produces badly
optimized code for aes_encrypt/aes_decrypt. This is especially true when
CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
large stack usage that in turn might cause kernel stack overflows:

  crypto/aes_generic.c: In function 'aes_encrypt':
  crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
  crypto/aes_generic.c: In function 'aes_decrypt':
  crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

I verified that this problem exists on all architectures that are
supported by gcc-7.2, though arm64 in particular is less affected than
the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
stack usage but still produce worse code than earlier versions for this
file, apparently because of optimization passes that generally provide
a substantial improvement in object code quality but understandably fail
to find any shortcuts in the AES algorithm.

Possible workarounds include

a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
   patch I tried, which reliably fixed the stack usage, but caused a
   serious performance regression in some versions, as later testing
   found.

b) disabling UBSAN on this file or all ciphers, as suggested by Ard
   Biesheuvel. This would lead to massively better crypto performance in
   UBSAN-enabled kernels and avoid the stack usage, but there is a concern
   over whether we should exclude arbitrary files from UBSAN at all.

c) Forcing the optimization level in a different way. Similar to a),
   but rather than deselecting specific optimization stages,
   this now uses "gcc -Os" for this file, regardless of the
   CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option.
   This is a reliable workaround for the stack consumption on all
   architectures, and I've retested the performance results now on x86,
   cycles/byte (lower is better) for cbc(aes-generic) with 256 bit keys:

                -O2     -Os
   gcc-6.3.1    14.9    15.1
   gcc-7.0.1    14.7    15.3
   gcc-7.1.1    15.3    14.7
   gcc-7.2.1    16.8    15.9
   gcc-8.0.0    15.5    15.6

This implements the option c) by forcing -Os on all compiler versions
starting with gcc-7.1. As a workaround for PR83356, it would only be
needed for gcc-7.2+ with UBSAN enabled, but since it also shows better
performance on gcc-7.1 without UBSAN, it seems appropriate to use the
faster version here as well.

Side note: during testing, I also played with the AES code in libressl,
which had a similar performance regression from gcc-6 to gcc-7.2,
but was three times slower overall. It might be interesting to
investigate that further and possibly port the Linux implementation
into that.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
Cc: Richard Biener
Cc: Jakub Jelinek
Cc: Ard Biesheuvel
Signed-off-by: Arnd Bergmann
Acked-by: Ard Biesheuvel
Signed-off-by: Herbert Xu
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 crypto/Makefile |    1 +
 1 file changed, 1 insertion(+)

--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -98,6 +98,7 @@ obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += t
 obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
 obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
+CFLAGS_aes_generic.o := $(call cc-ifversion, -ge, 0701, -Os) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
 obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
 obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o
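
For anyone reviewing the one-line Makefile change, the following is a rough Python model of which compilers end up with -Os. It is only a sketch: cc_version() and cc_ifversion() here are hypothetical Python stand-ins, not the kbuild macros themselves, and it assumes that kbuild encodes the compiler version as a four-digit major/minor string (which is what the 0701 threshold in the patch suggests) and compares it numerically.

```python
import operator

# Shell `test` style comparison operators, mapped to Python equivalents.
_OPS = {
    "-lt": operator.lt, "-le": operator.le, "-eq": operator.eq,
    "-ne": operator.ne, "-ge": operator.ge, "-gt": operator.gt,
}


def cc_version(major: int, minor: int) -> str:
    """Assumed encoding: two digits of major, two of minor (gcc 7.2 -> '0702')."""
    return f"{major:02d}{minor:02d}"


def cc_ifversion(version: str, op: str, threshold: str,
                 flag: str, otherwise: str = "") -> str:
    """Return `flag` if `version op threshold` holds numerically, else `otherwise`."""
    return flag if _OPS[op](int(version), int(threshold)) else otherwise


if __name__ == "__main__":
    # Compiler series taken from the benchmark table in the commit message.
    for major, minor in [(6, 3), (7, 0), (7, 1), (7, 2), (8, 0)]:
        flag = cc_ifversion(cc_version(major, minor), "-ge", "0701", "-Os")
        print(f"gcc-{major}.{minor}: extra CFLAGS = {flag!r}")
```

Under this model, gcc-6.3 and gcc-7.0 keep whatever optimization level the kernel configuration selects, while gcc-7.1 and later get -Os appended for aes_generic.o only.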