From: Linus Torvalds
Date: Wed, 28 Sep 2022 12:00:11 -0700
Subject: Re: [PATCH v3] x86, mem: move memmove to out of line assembler
To: Rasmus Villemoes
Cc: Nick Desaulniers, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin", Peter Zijlstra, Kees Cook, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, Andy Lutomirski
References: <202209271333.10AE3E1D@keescook> <20220927210248.3950201-1-ndesaulniers@google.com>

On Wed, Sep 28, 2022 at 12:24 AM Rasmus Villemoes wrote:
>
> > +	/*
> > +	 * movs instruction have many startup latency
> > +	 * so we handle small size by general register.
> > +	 */
> > +	cmpl	$680, n
> > +	jb	.Ltoo_small_forwards
>
> OK, this I get, there's some overhead, and hence we need _some_ cutoff
> value; 680 is probably chosen by some trial-and-error, but the exact
> value likely doesn't matter too much.
>
> > +	/*
> > +	 * movs instruction is only good for aligned case.
> > +	 */
> > +	movl	src, tmp0
> > +	xorl	dest, tmp0
> > +	andl	$0xff, tmp0
> > +	jz	.Lforward_movs
>
> But this part I don't understand at all. This checks that the src and
> dest have the same %256 value, which is a rather odd thing,

Both of these checks basically reflect the time the original code was
added, back in 2011, and are basically "that was the rep movs
implementation of the time".

Neither of them is very relevant today, and not the right way to check
anyway (ie FSRM should replace that test for 680 bytes etc).

But fixing the code to check the right things should probably be a
separate issue from the "move from inline asm to explicit asm", so I
think the patch is right this way.

                Linus
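
For readers following the thread, here is a minimal C sketch of what the two quoted checks compute. It is only an illustration, not code from the patch: the function name and macro names are hypothetical, and only the constants 680 and 0xff and the branch labels come from the quoted assembly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the quoted assembly. */
#define REP_MOVS_MIN_BYTES 680u   /* cmpl $680, n */
#define LOW_BYTE_MASK      0xffu  /* andl $0xff, tmp0 */

/*
 * Returns true when the quoted sequence would take the
 * "jz .Lforward_movs" branch for the given arguments.
 */
static bool takes_forward_movs(uintptr_t dest, uintptr_t src, size_t n)
{
	/* jb .Ltoo_small_forwards: unsigned "below" comparison on n. */
	if (n < REP_MOVS_MIN_BYTES)
		return false;

	/*
	 * movl/xorl/andl/jz: the branch is taken only when the low
	 * 8 bits of src and dest are identical, that is, when
	 * src % 256 == dest % 256.
	 */
	return ((src ^ dest) & LOW_BYTE_MASK) == 0;
}
```

Under this reading, takes_forward_movs(0x1004, 0x2008, 4096) is false even though both addresses are 4-byte aligned, because their low bytes differ (0x04 vs 0x08). That is the oddity Rasmus points out: the test compares the low bytes of the two addresses rather than testing either one for alignment.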