MIME-Version: 1.0
References: <202209271333.10AE3E1D@keescook> <20220927210248.3950201-1-ndesaulniers@google.com>
From: Nick Desaulniers
Date: Wed, 28 Sep 2022 13:49:32 -0700
Subject: Re: [PATCH v3] x86, mem: move memmove to out of line assembler
To: Rasmus Villemoes
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H.
 Peter Anvin", Peter Zijlstra, Kees Cook, linux-kernel@vger.kernel.org,
 Linus Torvalds, llvm@lists.linux.dev, Andy Lutomirski
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 28, 2022 at 12:06 PM Nick Desaulniers wrote:
>
> On Wed, Sep 28, 2022 at 12:24 AM Rasmus Villemoes wrote:
> >
> > On 27/09/2022 23.02, Nick Desaulniers wrote:
> >
> > > +	/*
> > > +	 * Handle data forward by movs.
> > > +	 */
> > > +.p2align 4
> > > +.Lforward_movs:
> > > +	movl	-4(src, n), tmp0
> > > +	leal	-4(dest, n), tmp1
> > > +	shrl	$2, n
> > > +	rep	movsl
> > > +	movl	tmp0, (tmp1)
> > > +	jmp	.Ldone
> >
> > So in the original code, %1 was forced to be %esi and %2 was forced to
> > be %edi and they were initialized by src and dest. But here I fail to
> > see how those registers have been properly set up before the rep movs;
> > your names for those are tmp0 and tmp2. You have just loaded the last
> > word of the source to %edi, and AFAICT %esi aka tmp2 is entirely
> > uninitialized at this point (the only use is in L16_byteswap).
> >
> > I must be missing something. Please enlighten me.
>
> No, you're right. It looks like rep movsl needs src in %esi and dest
> needs to be in %edi, so I can't reuse the input registers from
> -mregparm=3; a pair of movs is required. A v4 is required.
>
> Probably should write a test for memcpy where n > magic constant 680.

This unit test hangs with v3 (and passes with my local v4 which I
haven't sent out yet):
```
index 62f8ffcbbaa3..c2e852762846 100644
--- a/lib/memcpy_kunit.c
+++ b/lib/memcpy_kunit.c
@@ -107,6 +107,8 @@ static void memcpy_test(struct kunit *test)
 #undef TEST_OP
 }
 
+static unsigned char larger_array [2048];
+
 static void memmove_test(struct kunit *test)
 {
 #define TEST_OP "memmove"
@@ -181,6 +183,20 @@ static void memmove_test(struct kunit *test)
 	ptr = &overlap.data[2];
 	memmove(ptr, overlap.data, 5);
 	compare("overlapping write", overlap, overlap_expected);
+
+	/* Verify larger overlapping moves. */
+	larger_array[256] = 0xaa;
+	memmove(larger_array, larger_array + 256, 1024);
+	KUNIT_ASSERT_EQ(test, larger_array[0], 0xaa);
+	KUNIT_ASSERT_EQ(test, larger_array[256], 0x00);
+	KUNIT_ASSERT_NULL(test,
+		memchr(larger_array + 1, 0xaa, ARRAY_SIZE(larger_array) - 1));
```

I'll include the tests in my v4, including another for overlapping
memmove forwards.
-- 
Thanks,
~Nick Desaulniers