From: David Laight
To: 'Willy Tarreau', Ammar Faizi
CC: Thomas Weißschuh, "Nicholas Rosenberg", Alviro Iskandar Setiawan, Michael William Jonathan, GNU/Weeb Mailing List, Linux Kernel Mailing List
Subject: RE: [RFC PATCH v1 3/5] tools/nolibc: x86-64: Use `rep cmpsb` for `memcmp()`
Date: Mon, 4 Sep 2023 08:26:49 +0000
References: <20230830135726.1939997-1-ammarfaizi2@gnuweeb.org> <20230830135726.1939997-4-ammarfaizi2@gnuweeb.org>

From: Willy Tarreau
> Sent: 30 August 2023 22:27
>
> On Wed, Aug 30, 2023 at 08:57:24PM +0700, Ammar Faizi wrote:
> > Simplify memcmp() on the x86-64 arch.
> >
> > The x86-64 arch has a 'rep cmpsb' instruction, which can be used to
> > implement the memcmp() function.
> >
> >     %rdi = source 1
> >     %rsi = source 2
> >     %rcx = length
> >
> > Signed-off-by: Ammar Faizi
> > ---
> >  tools/include/nolibc/arch-x86_64.h | 19 +++++++++++++++++++
> >  tools/include/nolibc/string.h      |  2 ++
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/tools/include/nolibc/arch-x86_64.h b/tools/include/nolibc/arch-x86_64.h
> > index 42f2674ad1ecdd64..6c1b54ba9f774e7b 100644
> > --- a/tools/include/nolibc/arch-x86_64.h
> > +++ b/tools/include/nolibc/arch-x86_64.h
> > @@ -214,4 +214,23 @@ __asm__ (
> >  	"retq\n"
> >  );
> >
> > +#define NOLIBC_ARCH_HAS_MEMCMP
> > +static int memcmp(const void *s1, const void *s2, size_t n)
> > +{
> > +	const unsigned char *p1 = s1;
> > +	const unsigned char *p2 = s2;
> > +
> > +	if (!n)
> > +		return 0;
> > +
> > +	__asm__ volatile (
> > +		"rep cmpsb"
> > +		: "+D"(p2), "+S"(p1), "+c"(n)
> > +		: "m"(*(const unsigned char (*)[n])s1),
> > +		  "m"(*(const unsigned char (*)[n])s2)
> > +	);
> > +
> > +	return p1[-1] - p2[-1];
> > +}
>
> Out of curiosity, given that you implemented the 3 other ones directly
> in an asm statement, is there a particular reason this one mixes a bit
> of C and asm ? It would probably be something around this, in the same
> vein:
>
> memcmp:
>     xchg  %esi,%eax    // source1

Aren't the arguments in %rdi, %rsi and %rdx, so you only
need to move the count (below)?
(Looks like you copied memchr())

	David

>     mov   %rdx,%rcx    // count
>     rep   cmpsb        // source2 in rdi; sets ZF on equal, CF if src1 < src2
>     seta  %al          // 0 if src2 <= src1, 1 if src2 > src1
>     sbb   $0, %al      // 0 if src2 == src1, -1 if src2 < src1, 1 if src2 > src1
>     movsx %al, %eax    // sign extend to %eax
>     ret
>
> Note that the output logic could have to be revisited, I'm not certain but
> at first glance it looks valid.
>
> Regards,
> Willy

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
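
For illustration only, here is a minimal C sketch of the register point raised above. It is not the patch under review and not a proposed replacement. It assumes GCC/Clang extended asm on x86-64 with the System V calling convention, and it falls back to a plain byte loop elsewhere. The name sketch_memcmp() is hypothetical, chosen to avoid clashing with the libc memcmp(). Keeping s1 in %rdi and s2 in %rsi (where the ABI already places them) means only the count has to reach %rcx; the cost is that the flags describe s2 - s1, so the result is derived as "below minus above".

```c
#include <stddef.h>

/* Hypothetical helper; a sketch under the assumptions stated above. */
static int sketch_memcmp(const void *s1, const void *s2, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
	unsigned char above, below;

	/* With a zero count the string op runs no iteration and leaves the flags stale. */
	if (!n)
		return 0;

	__asm__ volatile (
		"repe cmpsb\n\t"	/* stop at the first differing byte or when %rcx hits 0 */
		"seta %[above]\n\t"	/* flags come from (%rsi) - (%rdi), i.e. s2 - s1 */
		"setb %[below]"		/* setcc does not modify the flags */
		: [above] "=q"(above), [below] "=q"(below),
		  "+D"(s1), "+S"(s2), "+c"(n)
		:
		: "cc", "memory"
	);

	/* below: s2 byte < s1 byte, so s1 > s2 and the result is positive. */
	return (int)below - (int)above;
#else
	/* Portable fallback with the same -1/0/1 result convention. */
	const unsigned char *p1 = s1;
	const unsigned char *p2 = s2;

	while (n--) {
		if (*p1 != *p2)
			return (*p1 > *p2) - (*p1 < *p2);
		p1++;
		p2++;
	}
	return 0;
#endif
}
```

The comparison is unsigned (seta/setb), which matches the memcmp() requirement that bytes be compared as unsigned char.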