From: Rui Wang
Date: Mon, 10 Apr 2023 22:22:39 +0800
Subject: Re: [PATCH] LoongArch: Improve memory ops
To: Xi Ruoyao
Cc: Huacai Chen, WANG Xuerui, loongarch@lists.linux.dev, linux-kernel@vger.kernel.org, loongson-kernel@lists.loongnix.cn
References: <20230410115734.93365-1-wangrui@loongson.cn>

On Mon, Apr 10, 2023 at 8:54 PM Xi Ruoyao wrote:
> Regarding these functions: we have -ffreestanding which is preventing
> the compiler from optimizing for e.g. "memcpy(a, b, 8);" into a simple
> ld.d/st.d pair. An explicit compiler built-in usage like
>
> #define memcpy(a, b, c) __builtin_memcpy(a, b, c)
>
> would allow the compiler to do this kind of optimization. Will this
> improve the performance?

That's a good question. IIUC, the current compiler generates inefficient
code for constant-length memcpy, trading performance for compatibility,
since not all hardware supports unaligned memory access. We need a
runtime CPU feature dispatch mechanism, similar to alternatives, to
improve the compiler. This is indeed a problem that needs to be
addressed.

Regards,
Rui