From: "Jason A. Donenfeld"
Date: Fri, 26 Nov 2021 10:03:16 -0500
Subject: Re: [PATCH] crypto: siphash - use _unaligned version by default
To: Arnd Bergmann
Cc: Linux Crypto Mailing List, Arnd Bergmann, Ard Biesheuvel,
    Nathan Chancellor, Nick Desaulniers, "David S. Miller",
    Jean-Philippe Aumasson, LKML, llvm@lists.linux.dev
In-Reply-To: <20211126143329.2689618-1-arnd@kernel.org>

Hi Arnd,

It looks like Ard's old patch never got picked up, so you're dusting it
off. You're doing two things here: moving from an #ifndef to a much
nicer IS_ENABLED(), and changing the logic a bit.
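For context, here's roughly what the wrapper being patched looks like
today (paraphrased from memory, so treat it as a sketch of
include/linux/siphash.h rather than the exact code):

static inline u32 hsiphash(const void *data, size_t len,
                           const hsiphash_key_t *key)
{
        /* Fall back to the unaligned-capable version only when the CPU
         * lacks efficient unaligned access and the pointer isn't
         * suitably aligned. */
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
        if (!IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
                return __hsiphash_unaligned(data, len, key);
#endif
        return ___hsiphash_aligned(data, len, key);
}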
In trying to understand the logic part, I changed this in my buffer:

-#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-       if (!IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
+       if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ||
+           !IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
                return __hsiphash_unaligned(data, len, key);
-#endif
        return ___hsiphash_aligned(data, len, key);

into this:

-       if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
-           !IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
+       if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) ||
+           !IS_ALIGNED((unsigned long)data, HSIPHASH_ALIGNMENT))
                return __hsiphash_unaligned(data, len, key);
        return ___hsiphash_aligned(data, len, key);

This way I can actually think about what's happening here.

So with the old one, we use the faster aligned version if *either* the
CPU has efficient unaligned access OR the bytes are statically known to
be aligned. This seems sensible.

On the new one, we use the faster aligned version only if *both* the
bytes are statically known to be aligned (ok) AND the CPU doesn't
actually support efficient unaligned accesses (?). This seems kind of
weird. It also means that CPUs with fast unaligned accesses wind up
calling the slower code path in some cases.

Is your supposition that the compiler will always optimize the slow
codepath to the fast one if the CPU it's compiling for supports that?
(See the P.S. below for what I mean.) Have you tested this on all
platforms?

Would it make sense to instead just fix clang-13? Or even to:

- just get rid of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for armv6, or
- undef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for armv6 just in this
  file, or
- maybe less messy, split CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS into
  two ifdefs that make more sense for our usage?

Jason
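P.S. To make "optimize the slow codepath" concrete: as I understand it,
the unaligned helpers read each word through get_unaligned_le64() /
get_unaligned_le32(), which are built on a packed-struct load. Roughly
like this userspace sketch -- my own illustration of the idea, not the
actual lib/siphash.c or asm-generic/unaligned.h code, and with the
little-endian byte swapping left out:

#include <stdint.h>

/* Read a u64 from a possibly-misaligned pointer. The packed struct
 * tells the compiler the pointer may be unaligned; on CPUs with
 * hardware unaligned support this is expected to compile down to a
 * single ordinary load. */
static inline uint64_t load_unaligned_u64(const void *p)
{
        const struct { uint64_t v; } __attribute__((packed)) *u = p;
        return u->v;
}

/* The question above is whether every compiler/arch combination we
 * care about really turns this into the same code as an aligned load,
 * rather than a byte-at-a-time sequence. */
uint64_t first_word(const unsigned char *buf)
{
        return load_unaligned_u64(buf);
}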