From: Michael Ellerman
To: Linus Torvalds, Uros Bizjak, Catalin Marinas, Will Deacon, Russell King, Thomas Bogendoerfer, Heiko Carstens
Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Peter Zijlstra, Thomas Gleixner, Waiman.Long@hp.com, Paul McKenney, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 1/2] locking/lockref: Use try_cmpxchg64 in CMPXCHG_LOOP macro
Date: Thu, 26 May 2022 22:14:59 +1000
Message-ID: <871qwgmqws.fsf@mpe.ellerman.id.au>
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds writes:
> On Wed, May 25, 2022 at 7:40 AM Uros Bizjak wrote:
>>
>> Use try_cmpxchg64 instead of cmpxchg64 in CMPXCHG_LOOP macro.
>> x86 CMPXCHG instruction returns success in ZF flag, so this
>> change saves a compare after cmpxchg (and related move instruction
>> in front of cmpxchg). The main loop of lockref_get improves from:
>
> Ack on this one regardless of the 32-bit x86 question.
>
> HOWEVER.
>
> I'd like other architectures to pipe up too, because I think right now
> x86 is the only one that implements that "arch_try_cmpxchg()" family
> of operations natively, and I think the generic fallback for when it
> is missing might be kind of nasty.
>
> Maybe it ends up generating ok code, but it's also possible that it
> just didn't matter when it was only used in one place in the
> scheduler.

This patch seems to generate slightly *better* code on powerpc.

I see one register-to-register move that gets shifted slightly later, so
that it's skipped on the path that returns directly via the SUCCESS
case.

So LGTM.

> The lockref_get() case can be quite hot under some loads, it would be
> sad if this made other architectures worse.

Do you know of a benchmark that shows it up? I tried a few things but
couldn't get lockref_get() to count for more than 1-2%.

cheers
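
For readers following the thread, here is a minimal userspace sketch of the two loop shapes under discussion, written with C11 atomics rather than the kernel primitives. The names my_cmpxchg64(), my_try_cmpxchg64(), fallback_try_cmpxchg64() and demo() are hypothetical stand-ins, not kernel API, and a plain counter stands in for the lockref; the real CMPXCHG_LOOP macro does more than this. The fallback function only illustrates the general idea of building a try-style operation on top of a value-returning cmpxchg, which is the case Linus was asking other architectures to check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value-returning cmpxchg64():
 * returns whatever was in *ptr before the operation. */
static uint64_t my_cmpxchg64(_Atomic uint64_t *ptr, uint64_t old, uint64_t newval)
{
    /* On failure C11 writes the observed value into 'old';
     * on success 'old' already equals the previous value. */
    atomic_compare_exchange_strong(ptr, &old, newval);
    return old;
}

/* Hypothetical stand-in for try_cmpxchg64(): returns success, and on
 * failure refreshes *old with the value that was actually observed. */
static bool my_try_cmpxchg64(_Atomic uint64_t *ptr, uint64_t *old, uint64_t newval)
{
    return atomic_compare_exchange_strong(ptr, old, newval);
}

/* Sketch of a generic fallback: the try form built from the
 * value-returning form, which needs an explicit compare. */
static bool fallback_try_cmpxchg64(_Atomic uint64_t *ptr, uint64_t *old, uint64_t newval)
{
    uint64_t expected = *old;
    uint64_t prev = my_cmpxchg64(ptr, expected, newval);

    if (prev != expected) {
        *old = prev;
        return false;
    }
    return true;
}

/* Old loop shape: compare the returned value against 'old' by hand. */
static void inc_with_cmpxchg(_Atomic uint64_t *ptr)
{
    uint64_t old = atomic_load(ptr);

    for (;;) {
        uint64_t prev = my_cmpxchg64(ptr, old, old + 1);
        if (prev == old)
            return;     /* SUCCESS path */
        old = prev;     /* retry with the value we lost to */
    }
}

/* New loop shape: the primitive reports success and refreshes 'old'. */
static void inc_with_try_cmpxchg(_Atomic uint64_t *ptr)
{
    uint64_t old = atomic_load(ptr);

    while (!my_try_cmpxchg64(ptr, &old, old + 1))
        ;               /* 'old' was updated for us, just retry */
}

/* Exercises both loops once, then checks the fallback's failure case. */
uint64_t demo(void)
{
    _Atomic uint64_t count = 0;
    uint64_t stale = 0;

    inc_with_cmpxchg(&count);
    inc_with_try_cmpxchg(&count);

    /* A stale expected value must fail and be refreshed. */
    bool ok = fallback_try_cmpxchg64(&count, &stale, 100);
    assert(!ok && stale == 2);

    return atomic_load(&count);
}
```

Both loops leave the counter in the same state; the difference is only in where the compare and the reload of the old value happen, which is what shows up as the extra compare on x86 and the moved register copy on powerpc.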