From: Ard Biesheuvel
Date: Wed, 3 Jan 2018 17:36:09 +0000
Subject: Re: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra
To: Arnd Bergmann
Cc: PrasannaKumar Muralidharan, Jakub Jelinek, Herbert Xu, James Morris, Richard Biener, Jakub Jelinek, "David S. Miller", "open list:HARDWARE RANDOM NUMBER GENERATOR CORE", Linux Kernel Mailing List
References: <20171220205213.1025257-1-arnd@arndb.de> <20171220214648.GO2353@tucnak>

On 3 January 2018 at 16:37, Arnd Bergmann wrote:
> On Fri, Dec 22, 2017 at 4:47 PM, Ard Biesheuvel wrote:
>> On 21 December 2017 at 13:47, PrasannaKumar Muralidharan wrote:
>>> On 21 December 2017 at 17:52, Ard Biesheuvel wrote:
>>>> On 21 December 2017 at 10:20, Arnd Bergmann wrote:
>>>>
>>>> So my vote is to disable UBSAN for all such cipher implementations:
>>>> aes_generic, but also aes_ti, which has a similar 256 byte lookup
>>>> table [although it does not seem to be affected by the same issue as
>>>> aes_generic], and possibly others as well.
>>>>
>>>> Perhaps it makes sense to move core cipher code into a separate
>>>> sub-directory, and disable UBSAN at the directory level?
>>>>
>>>> It would involve the following files
>>>>
>>>> crypto/aes_generic.c
>>>> crypto/aes_ti.c
>>>> crypto/anubis.c
>>>> crypto/arc4.c
>>>> crypto/blowfish_generic.c
>>>> crypto/camellia_generic.c
>>>> crypto/cast5_generic.c
>>>> crypto/cast6_generic.c
>>>> crypto/des_generic.c
>>>> crypto/fcrypt.c
>>>> crypto/khazad.c
>>>> crypto/seed.c
>>>> crypto/serpent_generic.c
>>>> crypto/tea.c
>>>> crypto/twofish_generic.c
>>>
>>> As *SAN is enabled only on developer setup, is such a change required?
>>> Looks like I am missing something here. Can you explain what value it
>>> provides?
>>>
>>
>> Well, in this particular case, the value it provides is that the
>> kernel can still boot and invoke the AES code without overflowing the
>> kernel stack. Of course, this is a compiler issue that hopefully gets
>> fixed, but I think it may be reasonable to exclude some C code from
>> UBSAN by default.
>
> Any idea how to proceed here? I've retested with the latest gcc snapshot
> and verified that the problem is still there. No idea what the chance of
> getting it fixed before the 7.3 release is. From the performance tests
> I've done, the patch I posted is pretty much useless, it causes significant
> performance regressions on most other compiler versions.
>
> A minimal patch would be to disable UBSAN specifically for aes-generic.c
> for gcc-7.2+ but not gcc-8 to avoid the potential stack overflow. We could
> also force building with -Os on gcc-7, and leave UBSAN enabled,
> this would improve performance some 3-5% on x86 with gcc-7 (both
> 7.1 and 7.2.1) and avoid the stack overflow.
>

Can't we just disable UBSAN for that file for all GCC versions and be
done with it? It is not a production feature, and that code is
unlikely to change in ways where UBSAN would make a difference anyway,
nor is it ever executed on 99.9% of systems running Linux.
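
For illustration only, a minimal sketch of what excluding that one file
could look like. It assumes kbuild's per-object UBSAN_SANITIZE_<object>.o
override and that the line would go in crypto/Makefile; it is not a
tested patch and not necessarily what ends up being merged.

```make
# crypto/Makefile (sketch, placement is an assumption)
# Exclude only aes_generic.o from UBSAN instrumentation, regardless of
# compiler version; all other objects in crypto/ stay instrumented.
UBSAN_SANITIZE_aes_generic.o := n

# Directory-level variant, applicable if the core cipher code were moved
# into its own sub-directory as suggested earlier in the thread:
# UBSAN_SANITIZE := n
```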

> For the performance regression in gcc-7.2.1 on this file, I've opened
> a separate gcc PR now, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
> I've also tested the libressl version of their generic AES code, with
> mixed results (it appears to be much slower than the kernel version
> to start with, and while it has further performance regressions with recent
> compilers, those are with a different set of versions compared to the
> kernel implementation, and it does not suffer from the high stack usage).
>