From: Brian Gerst Subject: Re: [PATCH] sha: prevent removal of memset as dead store in sha1_update() Date: Thu, 25 Feb 2010 11:33:58 -0500 Message-ID: <73c1f2161002250833n120cda05s9371e5ce13cc0aac@mail.gmail.com> References: <4B8692E3.9030509@gmail.com> <19334.40337.651079.440912@pilspetsen.it.uu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Roel Kluin , Herbert Xu , "David S. Miller" , linux-crypto@vger.kernel.org, Andrew Morton , LKML To: Mikael Pettersson Return-path: In-Reply-To: <19334.40337.651079.440912@pilspetsen.it.uu.se> Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-crypto.vger.kernel.org On Thu, Feb 25, 2010 at 10:56 AM, Mikael Pettersson wr= ote: > Roel Kluin writes: > =C2=A0> Due to optimization A call to memset() may be removed as a de= ad store when > =C2=A0> the buffer is not used after its value is overwritten. > =C2=A0> > =C2=A0> Signed-off-by: Roel Kluin > =C2=A0> --- > =C2=A0> see http://cwe.mitre.org/data/slices/2000.html#14 > =C2=A0> > =C2=A0> checkpatch.pl, compile and sparse tested. Comments? > =C2=A0> > =C2=A0> diff --git a/crypto/sha1_generic.c b/crypto/sha1_generic.c > =C2=A0> index 0416091..86de0da 100644 > =C2=A0> --- a/crypto/sha1_generic.c > =C2=A0> +++ b/crypto/sha1_generic.c > =C2=A0> @@ -49,8 +49,8 @@ static int sha1_update(struct shash_desc *d= esc, const u8 *data, > =C2=A0> =C2=A0 =C2=A0 =C2=A0src =3D data; > =C2=A0> > =C2=A0> =C2=A0 =C2=A0 =C2=A0if ((partial + len) > 63) { > =C2=A0> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0u32 temp[SHA_WORKS= PACE_WORDS]; > =C2=A0> - > =C2=A0> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0u32 *temp =3D kzal= loc(SHA_WORKSPACE_WORDS * sizeof(u32), > =C2=A0> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0GFP_KERNEL); > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (partial) = { > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0done =3D -partial; > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0memcpy(sctx->buffer + partial, data, done + 64); > =C2=A0> @@ -64,6 +64,7 @@ static int sha1_update(struct shash_desc *d= esc, const u8 *data, > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0} while (done= + 63 < len); > =C2=A0> > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0memset(temp, = 0, sizeof(temp)); > =C2=A0> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0kfree(temp); > =C2=A0> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0partial =3D 0= ; > =C2=A0> =C2=A0 =C2=A0 =C2=A0} > =C2=A0> =C2=A0 =C2=A0 =C2=A0memcpy(sctx->buffer + partial, src, len -= done); > > At best this might solve the issue right now, but it's not > future-proof by any margin. > > One problem is that just like the lifetimes of auto variables are > known to the compiler, allowing dead store elimination (DSE) on them, > there is development going on to make malloc() and free() known to > the compiler. I don't think it's complete yet, but once free() is > known, the sequence "memset(p, 0, n); free(p);" will obviously be > DSE:d just like in the current case with the auto variable. > > And as soon as gcc can optimize malloc() and free(), you can be sure = that > some eager kernel hacker will mark the kernel's allocators accordingl= y, > and then we're back to square one. > > I fear that the only portable (across compiler versions) and safe > solution is to invoke an assembly-coded dummy function with prototype > > =C2=A0 =C2=A0 =C2=A0 =C2=A0void use(void *p); > > and rewrite the code above as > > =C2=A0 =C2=A0 =C2=A0 =C2=A0{ > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0u32 temp[...]; > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0... > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0memset(temp, 0= , sizeof temp); > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0use(temp); > =C2=A0 =C2=A0 =C2=A0 =C2=A0} > > This forces the compiler to consider the buffer live after the > memset, so the memset cannot be eliminated. > > The reason the use() function needs to be in assembly code is that > with link-time optimizations soon commonplace (LTO in gcc-4.5), > a compiler can possibly discover that even an out-of-line function > > =C2=A0 =C2=A0 =C2=A0 =C2=A0void use(void *p) { } > > doesn't in fact use *p, which then enables (in theory) the > preceeding memset() to be DSE:d. Would barrier() (which is a simple memory clobber) after the memset wor= k? -- Brian Gerst