From: Uros Bizjak
Date: Wed, 18 Jan 2023 23:01:29 +0100
Subject: Re: [PATCH] lib/genalloc: use try_cmpxchg in {set,clear}_bits_ll
To: Andrew Morton, Linus Torvalds, Mateusz Guzik
Cc: linux-kernel@vger.kernel.org
References: <20230118150703.4024-1-ubizjak@gmail.com> <20230118131825.c6daea81ea1e2dc6aa014f38@linux-foundation.org>

On Wed, Jan 18, 2023 at 10:55 PM Uros Bizjak wrote:
>
> On Wed, Jan 18, 2023 at 10:47 PM Uros Bizjak wrote:
> >
> > On Wed, Jan 18, 2023 at 10:18 PM Andrew Morton wrote:
> > >
> > > On Wed, 18 Jan 2023 16:07:03 +0100 Uros Bizjak wrote:
> > >
> > > > Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
> > > > {set,clear}_bits_ll. x86 CMPXCHG instruction returns success in ZF
> > > > flag, so this change saves a compare after cmpxchg (and related move
> > > > instruction in front of cmpxchg).
> > > >
> > > > Also, try_cmpxchg implicitly assigns old *ptr value to "old"
> > > > when cmpxchg fails.
> > > >
> > > > Note that the value from *ptr should be read using READ_ONCE to prevent
> > > > the compiler from merging, refetching or reordering the read.
> > > >
> > > > The patch also declares these two functions inline, to ensure inlining.
> > >
> > > But why is that better? This adds a few hundred bytes more text, which
> > > has a cost.
> >
> > Originally, both functions are inlined and the size of an object file
> > is (gcc version 12.2.1, x86_64):
> >
> >    text    data     bss     dec     hex filename
> >    4661     480       0    5141    1415 genalloc-orig.o
> >
> > When try_cmpxchg is used, gcc chooses to not inline set_bits_ll (its
> > estimate of code size is not very precise when multi-line assembly is
> > involved), resulting in:
> >
> >    text    data     bss     dec     hex filename
> >    4705     488       0    5193    1449 genalloc-noinline.o
> >
> > And with an inline added to avoid gcc's quirks:
> >
> >    text    data     bss     dec     hex filename
> >    4629     480       0    5109    13f5 genalloc.o
> >
> > Considering that these two changed functions are used only in
> > genalloc.o, adding inline qualifier is a win, also when comparing to
> > the original size.
>
> BTW: Recently, it was determined [1] that the usage of cpu_relax()
> inside the cmpxchg loop can be harmful for performance. We actually
> have the same situation here, so perhaps cpu_relax() should be removed
> in the same way it was removed from the lockref.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f5fe24ef17b5fbe6db49534163e77499fb10ae8c

I forgot to add some CCs that may be interested in the above.

Uros.
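
For anyone joining the thread late, the two loop shapes being compared can be
sketched in userspace roughly as follows. This is only an illustration under
stated assumptions, not the code from lib/genalloc.c or from the patch: it
uses C11 <stdatomic.h> instead of the kernel's cmpxchg()/try_cmpxchg(), and
the names cmpxchg_ul, set_bits_cmpxchg, set_bits_try and self_check are
hypothetical. atomic_compare_exchange_strong() stands in for try_cmpxchg
because it likewise returns a success flag and writes the current value back
into the expected-value argument on failure. cpu_relax() is left out, since
it has no portable userspace equivalent.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/*
 * Hypothetical helper with cmpxchg-like semantics: always returns the value
 * that was in *ptr before the operation, whether or not the store happened.
 */
static unsigned long cmpxchg_ul(_Atomic unsigned long *ptr,
				unsigned long old, unsigned long newval)
{
	/* On failure, "old" is overwritten with the current *ptr value. */
	atomic_compare_exchange_strong(ptr, &old, newval);
	return old;
}

/* Old shape: the caller compares the returned value against the expected one. */
static int set_bits_cmpxchg(_Atomic unsigned long *addr, unsigned long mask)
{
	unsigned long val, nval = atomic_load(addr);

	do {
		val = nval;
		if (val & mask)
			return -EBUSY;	/* some requested bit is already set */
	} while ((nval = cmpxchg_ul(addr, val, val | mask)) != val);

	return 0;
}

/*
 * New shape: the primitive reports success directly and refreshes "val" on
 * failure, so no separate compare or reload is needed in the loop.
 */
static int set_bits_try(_Atomic unsigned long *addr, unsigned long mask)
{
	unsigned long val = atomic_load(addr);

	do {
		if (val & mask)
			return -EBUSY;
	} while (!atomic_compare_exchange_strong(addr, &val, val | mask));

	return 0;
}

/* Single-threaded usage example; both variants should behave the same. */
static void self_check(void)
{
	_Atomic unsigned long a = 0x0fUL, b = 0x0fUL;

	assert(set_bits_cmpxchg(&a, 0xf0UL) == 0);
	assert(set_bits_try(&b, 0xf0UL) == 0);
	assert(atomic_load(&a) == 0xffUL && atomic_load(&b) == 0xffUL);

	assert(set_bits_cmpxchg(&a, 0x01UL) == -EBUSY);
	assert(set_bits_try(&b, 0x01UL) == -EBUSY);
}
```

Both variants do the same thing; the second just lets the compiler use the
success flag directly, which is where the saved compare mentioned in the
quoted changelog comes from. Whether a given compiler actually emits tighter
code for the userspace version is not something this sketch demonstrates.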