Date: Wed, 30 Aug 2023 17:23:22 +0200
From: Willy Tarreau
To: Ammar Faizi
Cc: Alviro Iskandar Setiawan, Thomas Weißschuh, Nicholas Rosenberg, Michael William Jonathan, GNU/Weeb Mailing List, Linux Kernel Mailing List
Subject: Re: [RFC PATCH v1 2/5] tools/nolibc: x86-64: Use `rep stosb` for `memset()`
References: <20230830135726.1939997-1-ammarfaizi2@gnuweeb.org> <20230830135726.1939997-3-ammarfaizi2@gnuweeb.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 30, 2023 at 10:09:51PM +0700, Ammar Faizi wrote:
> On Wed, Aug 30, 2023 at 09:24:45PM +0700, Alviro Iskandar Setiawan wrote:
> > Just a small idea to shrink this more, "mov %rdi, %rdx" and "mov %rdx,
> > %rax" can be replaced with "push %rdi" and "pop %rax" (they are just a
> > byte). So we can save 4 bytes more.
> >
> > 0000000000001500 :
> >     1500: 48 89 f0    mov %rsi,%rax
> >     1503: 48 89 d1    mov %rdx,%rcx
> >     1506: 57          push %rdi
> >     1507: f3 aa       rep stos %al,%es:(%rdi)
> >     1509: 58          pop %rax
> >     150a: c3          ret
> >
> > But I know you don't like it because it costs extra memory access.
>
> Yes, that's an extra memory access. But I believe it doesn't hurt
> someone targetting -Os. In many cases, the compilers use push/pop to
> align the stack before a 'call' instruction. If they want to avoid extra
> memory access, they could have used "subq $8, %rsp" and "addq $8, %rsp".

Then "xchg %esi, %eax" is just one byte with no memory access ;-)

Willy
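
For illustration, here is a rough sketch of what the routine could look like
with the quoted push/pop idea and the xchg idea combined. It is not the code
from the patch: the names sketch_memset and sketch_memset_selfcheck are made
up for this example, and it assumes GNU C on an x86-64 Linux/ELF target (a
plain C fallback is used elsewhere). Swapping the 3-byte "mov %rsi,%rax" for
the 1-byte xchg would bring the 11-byte listing quoted above down to 9 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name for this sketch; not the symbol used by the patch. */
void *sketch_memset(void *dst, int c, size_t len);

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__)
/*
 * SysV AMD64 calling convention: dst in %rdi, c in %esi, len in %rdx.
 * The ABI guarantees the direction flag is clear on entry, so
 * "rep stosb" stores forward.
 */
__asm__ (
	".pushsection .text\n"
	".globl sketch_memset\n"
	".type sketch_memset, @function\n"
	"sketch_memset:\n"
	"	xchg %esi, %eax\n"	/* 96:       fill byte ends up in %al   */
	"	mov  %rdx, %rcx\n"	/* 48 89 d1: byte count for rep         */
	"	push %rdi\n"		/* 57:       save dst (one stack store) */
	"	rep stosb\n"		/* f3 aa:    store %al, %rcx times      */
	"	pop  %rax\n"		/* 58:       return the original dst    */
	"	ret\n"			/* c3                                   */
	".size sketch_memset, .-sketch_memset\n"
	".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
void *sketch_memset(void *dst, int c, size_t len)
{
	unsigned char *p = dst;

	while (len--)
		*p++ = (unsigned char)c;
	return dst;
}
#endif

/* Small usage example: fill the middle of a buffer, check the edges. */
int sketch_memset_selfcheck(void)
{
	unsigned char buf[16];
	size_t i;

	for (i = 0; i < sizeof(buf); i++)
		buf[i] = 0x11;

	assert(sketch_memset(buf + 4, 0xAB, 8) == (void *)(buf + 4));
	for (i = 0; i < sizeof(buf); i++)
		assert(buf[i] == ((i >= 4 && i < 12) ? 0xAB : 0x11));
	return 1;
}
```

The xchg leaves the old %eax value in %esi, which does no harm here: both
registers are caller-clobbered and stosb only reads %al.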