Sender: Michal Simek
From: Michal Simek
To: linux-kernel@vger.kernel.org, monstr@monstr.eu, michal.simek@xilinx.com, git@xilinx.com
Cc: Mahesh Bodapati, Randy Dunlap
Subject: [PATCH v2 2/3] microblaze: Do loop unrolling for optimized memset implementation
Date: Fri, 25 Feb 2022 14:55:35 +0100
Message-Id: <10a432e269a6d3349cf458e4f5792522779cba0d.1645797329.git.michal.simek@xilinx.com>
X-Mailer: git-send-email 2.35.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List:
linux-kernel@vger.kernel.org

Align the implementation with memcpy and memmove, where the remaining
bytes are also copied via a final switch statement instead of a simple
loop. Consistency is not the main motivation, though; it is just worth
keeping in mind that the same technique is already used there.

Since GCC 10, the -ftree-loop-distribute-patterns optimization is enabled
at -O2. It causes GCC to convert the while loop in memset.c into a call
to memset, i.e. it transforms a loop inside memset/memcpy into a call to
the function itself, which makes the memset implementation recursive.

The -ffreestanding option disables the built-in library functions, but it
has been added only for the generic library implementation. The default
microblaze kernel defconfig enables CONFIG_OPT_LIB_FUNCTION, so the
optimized, target-specific version of memset is always picked. Therefore
replace the while() loop with a switch statement to avoid the recursive
memset call.

The issue with -ffreestanding was already discussed in connection with
commit 33d0f96ffd73 ("lib/string.c: Use freestanding environment") and is
also a topic in glibc and gcc:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
http://patchwork.ozlabs.org/project/glibc/patch/20191121021040.14554-1-sandra@codesourcery.com/

Signed-off-by: Michal Simek
Signed-off-by: Mahesh Bodapati
---

Changes in v2:
- missing patch in v1

 arch/microblaze/lib/memset.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/microblaze/lib/memset.c b/arch/microblaze/lib/memset.c
index 615a2f8f53cb..7c2352d56bb0 100644
--- a/arch/microblaze/lib/memset.c
+++ b/arch/microblaze/lib/memset.c
@@ -74,8 +74,19 @@ void *memset(void *v_src, int c, __kernel_size_t n)
 	}
 
 	/* Simple, byte oriented memset or the rest of count. */
-	while (n--)
+	switch (n) {
+	case 3:
 		*src++ = c;
+		fallthrough;
+	case 2:
+		*src++ = c;
+		fallthrough;
+	case 1:
+		*src++ = c;
+		break;
+	default:
+		break;
+	}
 
 	return v_src;
 }
-- 
2.35.1
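
For illustration only, here is a minimal userspace sketch of the tail
handling introduced above, compared with the byte loop it replaces. It
is not the kernel code: the names fill_tail, fill_tail_loop and
tail_variants_agree are hypothetical, the kernel's fallthrough macro is
replaced by a comment, and the sketch assumes that the preceding
word-sized stores leave at most three bytes to write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write the last n bytes (n <= 3) without a loop,
 * so the compiler has no loop to turn into a memset call. */
static void fill_tail(unsigned char *dst, int c, size_t n)
{
	switch (n) {
	case 3:
		*dst++ = (unsigned char)c;
		/* fall through */
	case 2:
		*dst++ = (unsigned char)c;
		/* fall through */
	case 1:
		*dst++ = (unsigned char)c;
		break;
	default:
		/* n == 0: nothing to do; n > 3 is outside the assumed range */
		break;
	}
}

/* Reference variant: the simple byte loop that the patch removes. */
static void fill_tail_loop(unsigned char *dst, int c, size_t n)
{
	while (n--)
		*dst++ = (unsigned char)c;
}

/* Returns 1 if both variants produce identical buffers for every
 * n in 0..3, and 0 otherwise. */
static int tail_variants_agree(void)
{
	for (size_t n = 0; n <= 3; n++) {
		unsigned char a[4] = {0}, b[4] = {0};

		fill_tail(a, 0xAB, n);
		fill_tail_loop(b, 0xAB, n);
		if (memcmp(a, b, sizeof(a)) != 0)
			return 0;
	}
	return 1;
}
```

The two variants are interchangeable only while n stays in the range
0..3; the switch silently writes nothing for larger values, so it
depends on the earlier part of the function having consumed everything
else.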