From: Nick Desaulniers
Date: Fri, 23 Sep 2022 10:55:15 -0700
Subject: Re: [PATCH] x86, mem: move memmove to out of line assembler
To: Linus Torvalds
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra, Kees Cook, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, Andy Lutomirski
References: <20220923170218.1188423-1-ndesaulniers@google.com>

Dropping Ma, emails bouncing.

On Fri, Sep 23, 2022 at 10:30 AM Linus Torvalds wrote:
>
> On Fri, Sep 23, 2022 at 10:02 AM Nick Desaulniers wrote:
> >
> > memmove is quite large and probably shouldn't be inlined due to size
> > alone. A noinline function attribute would be the simplest fix, but
> > there's a few things that stand out with the current definition:
>
> I don't think your patch is wrong, and it's not that different from
> what the x86-64 code did long ago back in 2011 in commit 9599ec0471de
> ("x86-64, mem: Convert memmove() to assembly file and fix return value
> bug")
>
> But I wonder if we might be better off getting rid of that horrid
> memmove thing entirely.

We could remove __HAVE_ARCH_MEMMOVE from
arch/x86/include/asm/string_32.h for ARCH=i386 then rip this
arch-specific definition of memmove out. Might performance
regressions be a concern with that approach?

I'll write up a patch for that just to have on hand, and leave the
decision up to someone else.

> The original thing seems to be from 2010,
> where commit 3b4b682becdf ("x86, mem: Optimize memmove for small size
> and unaligned cases") did this thing for both 32-bit and 64-bit code.
>
> And it's really not likely used all that much - memcpy is the *much*
> more important function, and that's the one that has actually been
> updated for modern CPU's instead of looking like it's some copy from
> glibc or whatever.

I suspect that's probably where the duplicate 3 label comes from:
likely some macros were expanded from another codebase's
implementation, then copied into the kernel sources.

>
> To make things even stranger, on the x86-64 side, we actually have
> *two* copies of 'memmove()' - it looks like memcpy_orig() is already a
> memmove, in that it does that
>
>         cmp %dil, %sil
>         jl .Lcopy_backward
>
> thing that seems to mean that it already handles the overlapping case.
>
> Anyway, that 32-bit memmove() implemented (badly) as inline asm does
> need to go. As you point out, it seems to get the clobbers wrong too,
> so it's actively buggy and just happens to work because there's
> nothing else in that function, and it clobbers %bx that is supposed to
> be callee-saved, but *presumably* the compile has to allocate %bx one
> of the other registers, so it will save and restore %bx anyway.
>
> So that current memmove() seems to be truly horrendously buggy, but in
> a "it happens to work" way. Your patch seems an improvement, but I get
> the feeling that it could be improved a lot more...

--
Thanks,
~Nick Desaulniers
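
For reference, the overlap handling discussed above (pick a forward or a backward copy depending on how dest and src are ordered) can be sketched in portable C. This is only an illustrative sketch, not the kernel's implementation and not the code from the patch: the names sketch_memmove and sketch_memmove_selfcheck are hypothetical, and it is an assumption that a generic fallback used once __HAVE_ARCH_MEMMOVE is no longer defined would have roughly this byte-wise shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a byte-wise sketch, not the kernel's memmove. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Compare addresses as integers so unrelated buffers are handled too. */
    if ((uintptr_t)d <= (uintptr_t)s) {
        /* dest starts at or before src: a forward copy never
         * overwrites source bytes that have not been read yet. */
        while (n--)
            *d++ = *s++;
    } else {
        /* dest starts after src: copy from the end so that any
         * overlapping source bytes are read before being overwritten. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}

/* Small self-check exercising both overlapping directions. */
static void sketch_memmove_selfcheck(void)
{
    char fwd[] = "abcdefgh";
    char bwd[] = "abcdefgh";

    sketch_memmove(fwd, fwd + 2, 4);    /* dest < src: forward path */
    assert(memcmp(fwd, "cdefefgh", 8) == 0);

    sketch_memmove(bwd + 2, bwd, 4);    /* dest > src: backward path */
    assert(memcmp(bwd, "ababcdgh", 8) == 0);
}
```

A byte-at-a-time loop like this is presumably slower than a tuned assembly routine, which is the performance question raised above. It does show that the correctness-critical part is just the direction check.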