From: "Jason A. Donenfeld"
Date: Wed, 22 Apr 2020 14:17:28 -0600
Subject: Re: [PATCH crypto-stable] crypto: arch/lib - limit simd usage to PAGE_SIZE chunks
To: Ard Biesheuvel
Cc: Eric Biggers, Herbert Xu, Linux Crypto Mailing List, LKML
References: <20200420075711.2385190-1-Jason@zx2c4.com> <20200422040415.GA2881@sol.localdomain>
X-Mailing-List: linux-crypto@vger.kernel.org

On Wed, Apr 22, 2020 at 1:51 PM Jason A. Donenfeld wrote:
>
> On Wed, Apr 22, 2020 at 1:39 AM Ard Biesheuvel wrote:
> >
> > On Wed, 22 Apr 2020 at 09:32, Jason A. Donenfeld wrote:
> > >
> > > On Tue, Apr 21, 2020 at 10:04 PM Eric Biggers wrote:
> > > > Seems this should just be a 'while' loop?
> > > >
> > > > while (bytes) {
> > > >         unsigned int todo = min_t(unsigned int, PAGE_SIZE, bytes);
> > > >
> > > >         kernel_neon_begin();
> > > >         chacha_doneon(state, dst, src, todo, nrounds);
> > > >         kernel_neon_end();
> > > >
> > > >         bytes -= todo;
> > > >         src += todo;
> > > >         dst += todo;
> > > > }
> > >
> > > The for(;;) is how it's done elsewhere in the kernel (that this patch
> > > doesn't touch), because then we can break out of the loop before
> > > having to increment src and dst unnecessarily. Likely a pointless
> > > optimization as probably the compiler can figure out how to avoid
> > > that. But maybe it can't. If you have a strong preference, I can
> > > refactor everything to use `while (bytes)`, but if you don't care,
> > > let's keep this as-is. Opinion?
> > >
> >
> > Since we're bikeshedding, I'd prefer 'do { } while (bytes);' here,
> > given that bytes is guaranteed to be non-zero before we enter the
> > loop. But in any case, I'd prefer avoiding for(;;) or while(1) where
> > we can.
>
> Okay, will do-while it up for v2.

I just sent v2 containing do-while, and I'm fine with that going in
that way. But just in the interest of curiosity in the pan-tone
palette, check this out:

https://godbolt.org/z/VxXien

It looks like on mine, the compiler avoids unnecessarily calling those
adds on the last iteration, but on the other hand, it results in an
otherwise unnecessary unconditional jump for the < 4096 case. Sort of
interesting. Arm64 code is more or less the same difference too.
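
For anyone who wants to poke at the three loop shapes discussed in this thread outside the kernel, here is a rough userspace sketch. It is not the patch itself: CHUNK_SIZE (assumed to be 4096, standing in for PAGE_SIZE), simd_begin()/simd_end() (stand-ins for kernel_neon_begin()/kernel_neon_end()), process_chunk() (a stand-in for chacha_doneon()), and min_uint() are all hypothetical names made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for PAGE_SIZE, fixed at 4096 for this sketch. */
#define CHUNK_SIZE 4096u

/* Hypothetical counter: how many "SIMD sections" were entered. */
static unsigned int simd_sections;

/* Hypothetical stand-ins for kernel_neon_begin()/kernel_neon_end(). */
static void simd_begin(void) { simd_sections++; }
static void simd_end(void) { }

/* Hypothetical stand-in for chacha_doneon(): XOR each byte with a constant. */
static void process_chunk(uint8_t *dst, const uint8_t *src, unsigned int len)
{
	for (unsigned int i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Shape 1: plain while loop. Does nothing at all when bytes == 0. */
static void xor_while(uint8_t *dst, const uint8_t *src, unsigned int bytes)
{
	while (bytes) {
		unsigned int todo = min_uint(CHUNK_SIZE, bytes);

		simd_begin();
		process_chunk(dst, src, todo);
		simd_end();

		bytes -= todo;
		src += todo;
		dst += todo;
	}
}

/* Shape 2: for(;;) that breaks before the pointer increments. */
static void xor_for_break(uint8_t *dst, const uint8_t *src, unsigned int bytes)
{
	for (;;) {
		unsigned int todo = min_uint(CHUNK_SIZE, bytes);

		simd_begin();
		process_chunk(dst, src, todo);
		simd_end();

		bytes -= todo;
		if (!bytes)
			break;
		src += todo;
		dst += todo;
	}
}

/* Shape 3: do-while. Like shape 2, it runs the body once even if bytes == 0. */
static void xor_do_while(uint8_t *dst, const uint8_t *src, unsigned int bytes)
{
	do {
		unsigned int todo = min_uint(CHUNK_SIZE, bytes);

		simd_begin();
		process_chunk(dst, src, todo);
		simd_end();

		bytes -= todo;
		src += todo;
		dst += todo;
	} while (bytes);
}
```

All three should produce identical output and enter the same number of sections for any non-zero length; they differ only in what happens for bytes == 0 and in what the compiler does with the trailing adds and the loop entry, which is the part worth comparing in a disassembly.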