From: Nick Desaulniers
Date: Sat, 15 Aug 2020 15:17:39 -0700
Subject: Re: [PATCH v2] lib/string.c: implement stpcpy
To: Joe Perches
Cc: Kees Cook, Andrew Morton, Dávid Bolvanský, Eli Friedman, "# 3.4.x",
    Arvind Sankar, Sami Tolvanen, Vishal Verma, Dan Williams,
    Andy Shevchenko, "Joel Fernandes (Google)", Daniel Axtens,
    Ingo Molnar, Yury Norov, Alexandru Ardelean, LKML,
    clang-built-linux, Rasmus Villemoes

On Sat, Aug 15, 2020 at 2:31 PM Joe Perches wrote:
>
> On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
> > On Sat, Aug 15, 2020 at 2:24 PM Joe Perches wrote:
> > > On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
> > > > On Sat, Aug 15, 2020 at 9:34 AM Kees Cook wrote:
> > > > > On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
> > > > > > LLVM implemented a recent "libcall optimization" that lowers calls to
> > > > > > `sprintf(dest, "%s", str)` where the return value is used to
> > > > > > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> > > > > > in parsing format strings. Calling `sprintf` with overlapping arguments
> > > > > > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
> > > > > >
> > > > > > `stpcpy` is just like `strcpy` except it returns the pointer to the new
> > > > > > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in
> > > > > > one statement.
> > > > >
> > > > > O_O What?
> > > > >
> > > > > No; this is a _terrible_ API: there is no bounds checking, there are no
> > > > > buffer sizes. Anything using the example sprintf() pattern is _already_
> > > > > wrong and must be removed from the kernel. (Yes, I realize that the
> > > > > kernel is *filled* with this bad assumption that "I'll never write more
> > > > > than PAGE_SIZE bytes to this buffer", but that's both theoretically
> > > > > wrong ("640k is enough for anybody") and has been known to be wrong in
> > > > > practice too (e.g. when suddenly your writing routine is reachable by
> > > > > splice(2) and you may not have a PAGE_SIZE buffer).
> > > > >
> > > > > But we cannot _add_ another dangerous string API. We're already in a
> > > > > terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This
> > > > > needs to be addressed up by removing the unbounded sprintf() uses. (And
> > > > > to do so without introducing bugs related to using snprintf() when
> > > > > scnprintf() is expected[4].)
> > > >
> > > > Well, everything (-next, mainline, stable) is broken right now (with
> > > > ToT Clang) without providing this symbol. I'm not going to go clean
> > > > the entire kernel's use of sprintf to get our CI back to being green.
> > >
> > > Maybe this should get place in compiler-clang.h so it isn't
> > > generic and public.
> >
> > https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and
> > https://bugs.llvm.org/show_bug.cgi?id=47144
> > Seem to imply that Clang is not the only compiler that can lower a
> > sequence of libcalls to stpcpy. Do we want to wait until we have a
> > fire drill w/ GCC to move such an implementation from
> > include/linux/compiler-clang.h back in to lib/string.c?
>
> My guess is yes, wait until gcc, if ever, needs it.

The suggestion to use static inline doesn't even make sense. The
compiler is lowering calls to other library routines; `stpcpy` isn't
being explicitly called. Even if it was, not sure we want it being
inlined. No symbol definition will be emitted; problem not solved.

And I refuse to add any more code using `extern inline`. Putting the
definition in lib/string.c is the most straightforward and avoids
revisiting this issue in the future for other toolchains. I'll limit
access by removing the declaration, and adding a comment to avoid its
use.

But if you're going to use a gnu target triple without using
-ffreestanding because you *want* libcall optimizations, then you have
to provide symbols for all possible library routines!

-- 
Thanks,
~Nick Desaulniers
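
For readers following the thread, here is a minimal sketch of the semantics under discussion. It is not the code from the patch: `my_stpcpy`, `join3`, `len_via_sprintf` and `len_via_stpcpy` are hypothetical names used only for illustration, and this is ordinary userspace C rather than the kernel's lib/string.c. It shows that `stpcpy` returns a pointer to the terminating NUL written into `dest`, which is what allows chaining and what lets `sprintf(dest, "%s", str)` with a used return value be rewritten as `stpcpy(dest, str) - dest`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with a libc-provided stpcpy. */
static char *my_stpcpy(char *dest, const char *src)
{
	/* Copy bytes up to and including the terminating NUL. */
	while ((*dest = *src) != '\0') {
		dest++;
		src++;
	}
	/* dest now points at the NUL just written, i.e. the new tail. */
	return dest;
}

/* Chain three copies in one statement; returns the resulting length. */
static size_t join3(char *buf, size_t cap,
		    const char *a, const char *b, const char *c)
{
	/* my_stpcpy does no bounds checking, so the caller must size buf. */
	assert(strlen(a) + strlen(b) + strlen(c) < cap);
	return (size_t)(my_stpcpy(my_stpcpy(my_stpcpy(buf, a), b), c) - buf);
}

/* The source-level pattern: sprintf with "%s" whose return value is used. */
static int len_via_sprintf(char *dest, const char *str)
{
	return sprintf(dest, "%s", str);
}

/* The form a compiler may lower the above to (assumes non-overlapping args). */
static int len_via_stpcpy(char *dest, const char *str)
{
	return (int)(my_stpcpy(dest, str) - dest);
}
```

Both length helpers leave the same bytes in `dest` and return the same count. As Kees points out in the quoted text, nothing in `stpcpy` itself checks buffer sizes; the `assert` in `join3` is only a reminder that the caller has to size the buffer.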