Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp4891197imm; Wed, 30 May 2018 14:14:31 -0700 (PDT) X-Google-Smtp-Source: ADUXVKL7ui0yj8gk2AC/30IVVv5492FTRA8D7REYiRzXoy80+B0CNnDiJEJrAtqKIfvdnimU63wr X-Received: by 2002:a63:6fc8:: with SMTP id k191-v6mr3409842pgc.153.1527714871482; Wed, 30 May 2018 14:14:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527714871; cv=none; d=google.com; s=arc-20160816; b=UbSAbWeXzqso6p0mTLPTSMOqNnT70xiYQJ/zpd8BndOEqtXjtFXCr9K8nexIZX9uy5 hP7HO5fnwB1cKMeCZqXZk3VtiYTC85z9a0E25SAGqZX9/p4T1iHtKXIJ746/BWXgm9ld /J/ci7RD/ARwbCUSoBbaKMBueQCqF0BbEIBr0IEls60BAk7VXD7XWIQM4sLsljfXPoYf Q91zGm2gCPD8FkQEJxtteyqxlJs0JFpLuDzC5XV0bcasfqKL969uwyKQeNYVok1p/XxL VgqGAo0DMMSs7GV7KjBKww2wNqMQen/qx04tupQk5IkrOllb8U/jztiCmGOLA3L+ynh/ nVuw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:references:in-reply-to:mime-version :dkim-signature:dkim-signature:arc-authentication-results; bh=5bM++v3llPTZOHh8ZSw0xJpIfRCsogDKILR+b3+xbSE=; b=FHYEDFsJ7LJfLuU8Ldv+NKMWs69bnz1g3RyfSleSO7wV5cacGUUM5JTOydNkCOzpgJ DkBEMnDiG8OGxPZy39jrWduRC8V1VmOqGd9G5iFe4f9xjEAKFaXJk3EhV0+NookHl9uD hpaDP4cGnSWpeew4c/WugIdbsf2RkKui4xKm2gfyaVeMxqLQ3JubeQKBnXuwjtFoWbRl izS7Gd87rTi/g7ntwpuavmcI+23CfuZ00Jq1Mt/X98aaXfcVop3PWV7TU17IG6wAM3Yr 1BafiluUGEmul+SILjBw0zIZIHGsCOVkzLWl7JfwuVM7K9fswOQOk12ajj3IMcAVmQvB ooRw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@google.com header.s=20161025 header.b=vcPkG9ha; dkim=fail header.i=@chromium.org header.s=google header.b=lIbM/KSB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h128-v6si28154367pgc.545.2018.05.30.14.14.15; Wed, 30 May 2018 14:14:31 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@google.com header.s=20161025 header.b=vcPkG9ha; dkim=fail header.i=@chromium.org header.s=google header.b=lIbM/KSB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753658AbeE3VMn (ORCPT + 99 others); Wed, 30 May 2018 17:12:43 -0400 Received: from mail-ua0-f193.google.com ([209.85.217.193]:41405 "EHLO mail-ua0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753275AbeE3VMk (ORCPT ); Wed, 30 May 2018 17:12:40 -0400 Received: by mail-ua0-f193.google.com with SMTP id a3-v6so13533769uad.8 for ; Wed, 30 May 2018 14:12:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-transfer-encoding; bh=5bM++v3llPTZOHh8ZSw0xJpIfRCsogDKILR+b3+xbSE=; b=vcPkG9ha/4zGao0v0Zgc1d6qZ6gPPLnciXpCSsHiDuLjVgHGrxgoU9A+q39vy0PeEy K/tNXh0p10krKzB53d7AiS3SZA8wN/2ZrVYEXBscbfZj+tQG0GllKwx2JjWyBB2XtaYu BSirJ37O6ZiUh/vkHONz01VYhurY8d0k9cGKrZj2GdNAhVoxEm7YeodPUMDChA6pBTqx 4c75yEhC3YyeGb9JPPmNbT3jeFURSw1w4zmNNdvW6Ftjz9m0nqbUy2FO0fXBG1Zrjdhl R53mnkujfxbEeKvdOCi18xiFurMl0XYBdNRaf35hZF4AsEB6FPhAPFrJfXDgR7dq52PT ExHA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-transfer-encoding; bh=5bM++v3llPTZOHh8ZSw0xJpIfRCsogDKILR+b3+xbSE=; b=lIbM/KSB32qYpeEqTnb2anvUMn4lVB/vMCSnQV2kevNUEQoHdw6c8YxeZTer3RSNdA pQdywpV/m8Xv3yJatMH6yyx/1B/ee3iMdkMvHNJQSlXATDDtCH1yFLbOf9+zMxBMmcOd aWjEUkoHuGqmHM4oaWd4MYcKqC2lwPQproFjY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc:content-transfer-encoding; bh=5bM++v3llPTZOHh8ZSw0xJpIfRCsogDKILR+b3+xbSE=; b=nF8QmIEGhsiiAbfmzDjU1fZeRfTWNDnIkgghaY/Aipe2V1XzCaM3HC7UzNpsvLY2To zwh7yiXAyIB/1OgCh/8dVMsqZ/8eI0USuEABNarWzwsDIw6TywtcGIRf5v4gClX80Inw /kOf+Y7GrbA0nvdhyQJIekqPyRLAiLkn6kUBGV1HcmT0tClEBMUDJIZOHj9coC5COl1Q nukOtzF9ktejSu9lBuwP9lkBEvH52j5r/pNaEYP8V9J1irSiYETWunC800isxs+xK4rC 0VMRfFJzbGs2EmGGiachsCkST+DofSfP9d7Tdv5C2ZZ0WUf3jtoGH/AWlBN8hMQP/yX6 e8xw== X-Gm-Message-State: ALKqPwcblaGM9MSMgRb5ysdSpWerOqde6qhuKNJCpnM+1LY6XEBj8NA2 7raMjnMbpiZhGIxjEJBrLegt9C/hPXoWHp0I/kGKCg== X-Received: by 2002:a9f:2823:: with SMTP id c32-v6mr3020182uac.193.1527714759394; Wed, 30 May 2018 14:12:39 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a1f:a085:0:0:0:0:0 with HTTP; Wed, 30 May 2018 14:12:38 -0700 (PDT) In-Reply-To: <20180530134645.GA31844@parrot.com> References: <20180529224207.GA13354@beast> <20180530134645.GA31844@parrot.com> From: Kees Cook Date: Wed, 30 May 2018 14:12:38 -0700 X-Google-Sender-Auth: ABUQGIK4S18pknpvjvl7kC7Wwzk Message-ID: Subject: Re: [PATCH] lib/bch: Remove VLA usage To: Ivan Djelic Cc: Boris Brezillon , Brian Norris , David Woodhouse , Marek Vasut , Richard Weinberger , Linux mtd , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 30, 2018 at 6:46 AM, Ivan Djelic wrote= : > On Tue, May 29, 2018 at 03:42:07PM -0700, Kees Cook wrote: >> In the quest to remove all stack VLA usage from the kernel[1], this remo= ves >> the on-stack working buffers in favor of pre-allocated working buffers >> (which were already used in other places). Since these routines must >> already be serialized (since they work on bch->ecc_buf), adding the usag= e >> of bch->ecc_work would be similarly safe. Additionally, since "max m" is >> only 15, this was adjusted to just use a fixed size array in those cases= . > > Hi Kees, > > Using an on-stack buffer instead of a pre-allocated buffer was done initi= ally > for performance reasons. For "usual" (m,t) values (for instance m=3D13, = t=3D4), > there is a huge performance difference between the on-stack buffer versio= n and > the kmalloc version. I didn't investigate the reason for this, but I ran = a > quick benchmark on my PC: > > little-endian, type sizes: int=3D4 long=3D8 longlong=3D8 > cpu: Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz > calibration: iter=3D4.9143=C2=B5s niter=3D2034 nsamples=3D200 m=3D13 t=3D= 4 > > Buffer allocation | Encoding throughput (Mbit/s) > --------------------------------------------------- > on-stack, VLA | 3988 > on-stack, fixed | 4494 > kmalloc | 1967 > > The first line shows the performance of the current code, using a VLA. > The second line shows the performance when r[] is allocated on the stack = with > a fixed, constant size (the maximum allowed value). > The third line shows the performance when r is a pre-allocated working bu= ffer. > > In fact, when using a pre-allocated buffer there is no need to introduce = 'ecc_work': > you can directly point 'r' to bch->ecc_buf and remove memcpy() surroundin= g the > 'while (mlen--)' loop. Everything happens inside the 'bch->ecc_buf' buffe= r. > But with a big performance penalty. Looks like declaring a temporary buff= er on the > stack to store ECC values allows GCC to do a better job at optimizing the= loop. > > So rather than introducing 'ecc_work', I suggest we compute the maximum a= llowed > size for r[] and use that: > > sizeof(r) =3D sizeof(uint32_t)*(l+1) > l+1 =3D BCH_ECC_WORDS(bch) =3D DIV_ROUND_UP(m*t, 32) > > We also know that: > > m*t < 2^m - 1 (ECC maximum size) > > therefore: > > l+1 < DIV_ROUND_UP(2^m - 1, 32) < 2^(m-5) > > So instead of 'uint32_t r[l+1]' we could declare 'uint32_t r[1 << (BCH_MA= X_M-5)]'. > And replace 'sizeof(r)' with 'sizeof(*bch->ecc_buf)*(l+1)' in memset/memc= py calls. > In practice the actual maximum size of r[] is (1 << (15-5))*sizeof(uint32= _t) =3D 4096 bytes. > > What do you think ? I actually did that implementation first since I didn't realize how large that allocation could get. 4096 is a HUGE stack allocation. The kernel build warns at 2048. The defaults seen during allmodconfig are: CONFIG_BCH_CONST_M=3D14 CONFIG_BCH_CONST_T=3D4 So those builds are already seeing a large stack allocation, but it was hidden from the checking tools before because it was a dynamic stack allocation: lib/bch.c: In function =E2=80=98encode_bch=E2=80=99: lib/bch.c:261:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=3D] This could be masked in the Makefile, though, since this is already the situation the code runs under. I'll send that patch... -Kees --=20 Kees Cook Pixel Security